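OpenStack Cloud System Tuning Guide for the QG刮刮乐 Platform

This guide is written to help users of QG刮刮乐 platform servers build IaaS systems quickly and get the full performance out of the platform. It does not depend on any specific product model; it focuses on tuning methods for QG刮刮乐 platform servers, and it will be refined step by step as evaluation testing deepens and matures.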


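I. System Overview

1. Hardware environment

The test IaaS system consists of 46 nodes, all built on QG刮刮乐 platform dual-socket servers: 40 OpenStack nodes and 6 Ceph distributed storage nodes. See the table below for details.

| Node type | Node configuration | Count |
|---|---|---|
| Management node | 128G DDR4 memory + 120G SSD + Gigabit network x2 | 3 |
| Network node | 128G DDR4 memory + 120G SSD + Gigabit network x3 | 1 |
| Block storage node | 128G DDR4 memory + 120G SSD + Gigabit network x2 | 1 |
| Object storage node | 128G DDR4 memory + 120G SSD + 4T HDD x3 + Gigabit network x2 | 1 |
| Compute node | 128G DDR4 memory + 120G SSD + Gigabit network x2 | 33 |
| Monitoring node | 128G DDR4 memory + 120G SSD + Gigabit network x2 | 1 |
| Ceph Mon node | 128G DDR4 memory + 120G SSD + 10-Gigabit network x2 | 3 |
| Ceph OSD node | 128G DDR4 memory + 250G SSD + 500G SSD x3 + 2T HDD x12 + 10-Gigabit network x4 | 3 |
| Rally Server | Dell E3-1220v3 server | 1 |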


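2. Software environment

| Software | Version | Notes |
|---|---|---|
| Host OS | CentOS7.6 | ZX Patch v3.0.9.4 installed |
| Python | v2.7.5 | |
| Docker | v19.03.8 | |
| OpenStack | Stein | Container-based deployment |
| Mariadb | v10.3.10 | Multi-master Galera Cluster |
| SQLAlchemy | v0.12.0 | python module |
| RabbitMQ | v3.8.14 | With Erlang v22.3.4.21 |
| Haproxy | v1.5.18 | |
| KeepAlived | v1.3.5 | |
| Ceph | v14.2.19 | |
| Rally | v2.0 | Concurrency test tool |
| CosBench | v0.4.2 | Object storage performance test tool |
| TestPerf | v2.16.0 | Message queue performance test tool |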


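3. Storage scheme

The local disks of the OpenStack nodes are used only for the operating system. All storage resources offered to users use Ceph as the backend: nova, glance, cinder-volume, cinder-backup and manila all use Ceph as backend storage. For deploying and tuning a Ceph cluster on the QG刮刮乐 platform, see the document already published on the official website, "Best Practices for Deploying a Ceph Distributed Storage System on the QG刮刮乐 Platform".

The Swift object storage service is also deployed, on local disks, but it is not configured for use by other components, and its performance is outside the scope of this document.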


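4. Networking scheme

The cluster network is planned as shown in the figure below, based on the on-board NIC configuration of the QG刮刮乐 platform servers, the planned business scale and the existing network environment. The five logical networks are consolidated into three physical networks: the management and Tunnel networks share one physical network, the deployment and storage networks share another, and the external network is the third.

- Deployment network: used for PXE boot and for access to the local software repository mirror during installation;

- Management network: used for API access and SSH access between nodes;

- Tunnel network: used for connectivity between virtual machines on the compute nodes and for their connection to the network node; it mainly carries the east-west traffic of the workload;

- Storage network: used to access the unified storage backend.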


Figure 1. Network topology

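II. Tuning Strategy and Methods

1. Performance metrics

This document covers the performance of the IaaS platform, not the performance of the virtual machines, so testing and tuning concentrate on the key OpenStack components. The metrics we track are: the completion time of a single request, the 95th-percentile completion time of a batch of requests, and the tail latency of a batch of requests.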
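The three metrics can be computed from a list of per-request completion times, for example the durations taken from a Rally report. The following is a minimal sketch and not part of the original test tooling: the function name `summarize_latencies` and the nearest-rank percentile definition are assumptions made for illustration, and the tail latency is taken here simply as the slowest request.

```python
import statistics


def summarize_latencies(durations_s):
    """Return mean, 95th-percentile and maximum completion time in seconds."""
    if not durations_s:
        raise ValueError("need at least one sample")
    ordered = sorted(durations_s)
    # Nearest-rank percentile: the smallest sample that has at least 95% of
    # all samples at or below it. Integer arithmetic avoids float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "mean": statistics.mean(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],  # used here as the tail latency
    }
```

Comparing `p95` and `max` before and after a configuration change shows whether the change helped the bulk of the requests, the tail, or both.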


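2. Test method

Tuned parameters are obtained through a cycle of testing, analysis, tuning and retesting.

- Test tools

For OpenStack components we use Rally for batch testing. You can use the test cases shipped with Rally or define your own as needed, and the Rally report contains detailed timing statistics. In this test, batch requests were run at a concurrency of 200 against ten compute nodes, with Cirros as the guest OS.

For single requests, OSprofiler can also be used. Most requests in OpenStack pass through several components before they complete. The tool helps analyze how a request is processed, so that likely time sinks can be found and optimized before concurrency testing.

For RabbitMQ we used the TestPerf tool and designed several typical cases for performance testing, focusing on message queue throughput and message delivery latency.

- Analysis methods

Start from the test results and the logs, and first resolve the warnings and errors in the logs, such as service or database connection timeouts, failed database queries and resource shortages caused by bursts of highly concurrent requests.

Next, based on an understanding of the source code, add timestamps to collect the time spent at the key points of the business flow or data flow under test, then analyze the cause of each slow point one by one.

Another common method is a comparison test on a reference platform. It is used for verification when the problem has been narrowed down to a certain degree but the cause of the performance problem still cannot be explained, or when the cause may be related to the hardware architecture or system structure.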
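The timestamp method described above can be sketched as a small helper that records how long each stage of a flow takes. This is an illustrative sketch only: `timed_stage`, `slowest_stages` and the stage names are hypothetical and are not OpenStack or OSprofiler APIs.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# stage name -> list of elapsed times in seconds
stage_timings = defaultdict(list)


@contextmanager
def timed_stage(name):
    """Record the wall-clock time spent inside the with-block under `name`."""
    start = time.monotonic()
    try:
        yield
    finally:
        stage_timings[name].append(time.monotonic() - start)


def slowest_stages(limit=3):
    """Return up to `limit` (stage, total seconds) pairs, slowest first."""
    totals = {name: sum(values) for name, values in stage_timings.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Wrapping the suspected stages of a request in `with timed_stage("..."):` blocks and calling `slowest_stages()` after a test run gives a ranked list of where the time went.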


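- Optimization methods

Optimization of the OpenStack control plane has three main goals: raise the success rate under concurrency, shorten the average completion time, and reduce tail latency.

Concurrency tests with the default OpenStack configuration show that the success rate at 200 concurrent requests varies widely with the function under test. Generally, the more components a function involves and the higher the concurrency, the lower the success rate. The success rate is raised mainly by improving hardware performance and adjusting the configuration parameters of each component. On the QG刮刮乐 platform, besides faster key hardware (memory and NICs), two points matter even more for hardware performance. First, the server system must have the QG刮刮乐 BSP Patch applied. Second, from the design stage of the deployment scheme, affinity settings should be planned around the NUMA topology of the platform: avoid cross-NUMA access wherever possible, and where it cannot be avoided, use adjacent nodes and avoid cross-socket access. For component configuration, trace the processing path of requests to find the functions that take longest. As concurrency grows, component processing often hits retries and timeouts, which in turn cause requests to fail. When the processing efficiency of a component cannot be improved, increase the retry count or timeout appropriately so that requests do not fail outright.

To shorten the average completion time, besides the hardware improvements mentioned above, the specific stages that delay request processing must be identified clearly. Various tracing methods can be used to map the control flow and data flow, and log analysis or similar methods can measure the time spent in each stage. Once the key time-consuming functions are found, the available tuning methods are:

1. Change the component configuration to increase the number of processing threads and make full use of multiple cores;

2. Where component performance depends on operating system settings, tune those settings; see the tuning documents provided for the operating system and the tuning documents for other server products published by QG刮刮乐;

3. Components often offer several ways to perform the same function; change the configuration to use the more efficient one;

4. Look in the community for patches that improve request processing. Some processing is slow because the implementation itself is inefficient, and a patch may raise the processing capacity effectively. If no patch is available, one has to be developed in-house;

5. Optimize according to the characteristics of the Python language and avoid blocking threads;

6. Optimize the deployment scheme of the components: add component instances or nodes according to the concurrency pressure, or migrate components to balance the load across nodes;

7. Update the server OS with kernel patches that raise kernel processing capacity;

8. For general-purpose software such as the message queue, the database and the load balancer, upgrading the version or choosing a better-performing equivalent can also improve performance.

The recommended parameters in the following sections are all derived from test results.

Note that the test case strongly affects the right value of each parameter. For example, if the image used when creating virtual machines at a concurrency of 200 is changed from Cirros to Ubuntu, many timeouts and retry counts will probably have to be increased to keep the success rate at 100%. The values recommended in this document are therefore not suitable for production as they stand and serve only as a tuning reference.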


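III. Configuration of Key OpenStack Components

1. Database

The database is a critical service in an OpenStack system. Every component has its own database in the database service, used to store data about services, resources, users and so on. The database also serves as a coordination mechanism between components, so processing almost any request involves some database reads and writes.

During performance tuning of the cloud platform, database request response time deserves close attention. Excessive response time at run time is usually a complex problem that involves the database server or cluster, the proxy server and the client (that is, the component side). Each must be checked in turn to find the source of the problem and adjust accordingly. This section covers the server-side and client-side parameters; for the proxy server, see the later sections.

1.1 Mariadb

The test system uses three Mariadb nodes to build a multi-master Galera Cluster, fronted by haproxy for active/standby high availability. For general database tuning, refer to the "MySQL Optimization Manual"; one parameter that needs special attention in an OpenStack cloud system is added here.

- max_allowed_packet

This parameter sets the maximum packet size that the MariaDB server accepts. Large insert and update operations sometimes fail because max_allowed_packet is set too small.

- Recommended configuration:

The default value is 1024, in KB. An OpenStack cluster produces packets larger than the default, so the size needs to be increased appropriately.

Edit the mariadb configuration file galera.cnf:

[mysqld]
max_allowed_packet = 64M

Restart the mariadb service for the change to take effect.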


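1.2 oslo_db

Components access the database through oslo_db. In the actual deployment we configure oslo_db + SQLAlchemy as the database client, so the client-side parameters and tuning methods can be found in the official SQLAlchemy documentation. Note that newer SQLAlchemy versions bring performance gains, but upgrading must not be done casually because version compatibility has to be considered.

- slave_connection

The Stein release of oslo_db supports the slave_connection option, and some components, such as nova, already support read/write splitting: database writes use connection and database reads use slave_connection, which improves read performance.

Reference configuration method:

1. Use haproxy to provide separate forwarding entries for reads and writes. For example, writes use port 3306 with forwarding configured as one active and two standby servers; reads use port 3307 with forwarding configured by least connections.

2. For components that support read/write splitting, add the slave_connection option to the [database] section of their configuration file.

Note: some components accept the slave_connection option, but their code does not actually use slave_connection when reading from the database. Check carefully against the specific version.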
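Step 2 of the method above amounts to adding one option to the [database] section. The sketch below does this with Python's standard `configparser`. The helper name `add_slave_connection`, the host name and the credentials are placeholders assumed for illustration; only the option name `slave_connection`, the section name and the ports 3306/3307 come from the text. Note that `configparser` drops comments when it rewrites a file, so on a real deployment editing the file by hand or through the deployment tool may be preferable.

```python
import configparser
import io


def add_slave_connection(conf_text, read_url):
    """Return conf_text with [database]/slave_connection set to read_url."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("database"):
        parser.add_section("database")
    parser.set("database", "slave_connection", read_url)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical example values: the write entry on port 3306 stays in
# `connection`, the read entry on port 3307 goes into `slave_connection`.
EXAMPLE_CONF = (
    "[database]\n"
    "connection = mysql+pymysql://nova:PASSWORD@db-vip.example:3306/nova\n"
)
EXAMPLE_READ_URL = "mysql+pymysql://nova:PASSWORD@db-vip.example:3307/nova"
```

Calling `add_slave_connection(EXAMPLE_CONF, EXAMPLE_READ_URL)` returns the configuration text with both options present in the [database] section.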


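- Connection pool

Establishing a database connection is relatively expensive, so a connection pool is provided: connections in the pool are reused to improve efficiency. While tuning, watch the connections between each component and the database, for example the component's database request response time and the connection-related messages in the component logs. If response time is too long and investigation suggests that the component side spends too much time connecting to the database, the parameters below can be tried. The tests described in this document used the default values; users need to tune them according to run-time behavior. The main connection pool parameters are:

- min_pool_size: the number of open SQL connections kept in the pool must not fall below this value. The default is 1.

- max_pool_size: the maximum number of open SQL connections kept in the pool. The default is 5; 0 means no limit.

- max_overflow: the maximum number of connections allowed beyond max_pool_size. The default is 50.

- pool_timeout: when a connection is requested and the pool has no idle connection and has already reached max_pool_size + max_overflow, the requesting process waits up to pool_timeout seconds. The default is 30 s. If no connection is obtained within this time, an exception is raised. If this exception appears, consider increasing the number of connections in the pool.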
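When raising the pool limits, it helps to estimate how many connections the component could open in the worst case and compare that with the connection limit of the database server. The sketch below is simple arithmetic over the defaults listed above. It assumes that each worker process of a component keeps its own pool, and the function name is hypothetical.

```python
def worst_case_db_connections(workers, max_pool_size=5, max_overflow=50):
    """Upper bound on connections opened by one component with `workers` processes.

    Assumes every worker process owns a separate pool that may grow to
    max_pool_size + max_overflow connections.
    """
    if max_pool_size == 0:
        raise ValueError("max_pool_size=0 means unlimited; there is no upper bound")
    return workers * (max_pool_size + max_overflow)
```

For example, 12 workers with the defaults give 12 * (5 + 50) = 660 possible connections from a single component on a single node, a figure worth keeping in mind when several components share one database cluster.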


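2. Message queue

2.1 Rabbitmq

The test system uses three nodes to build a RabbitMQ Mirror Queue Cluster; all nodes are disk nodes.

The main performance concern with RabbitMQ is message delivery latency. In a cluster, most of that latency is spent guaranteeing data consistency between mirrored queues, and this processing is complex and hard to tune. The tuning methods currently available for a cluster are:

1. RabbitMQ runs on Erlang, and comparison tests show a large performance difference between older and newer versions. Use newer versions where possible, for example RabbitMQ 3.8 or later with Erlang v22.3 or later (choose according to the RabbitMQ version).

2. The more mirrors a queue has in the cluster, the longer each message spends on consistency, so the number of mirrored queues can be reduced according to actual needs.

3. If RabbitMQ is deployed in a container, docker parameters can be set to give RabbitMQ more CPU time and keep its CPU from being taken by other containers.

The recommended tuning parameters are as follows:

- collect_statistics_interval

By default, the RabbitMQ server collects system statistics every 5 s; rate information such as publish and delivery message rates is computed over this period.

- Recommended configuration

Increasing this value reduces the CPU usage caused by the RabbitMQ server collecting large amounts of status information. The unit is ms.

Edit the RabbitMQ configuration file rabbitmq.conf:

collect_statistics_interval = 30000

Restart the RabbitMQ service for the change to take effect.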


- cpu_share

Docker lets the user assign each container a number that represents its CPU share; by default each container's share is 1024. The share is relative and has no absolute meaning by itself. When several containers run on a host, the proportion of CPU time each container gets is its share divided by the total of all shares. The configured proportions only show when CPU is contended; if CPU is idle, a container with a low cpu_share can still obtain more CPU than its proportion.

- Recommended configuration

The control nodes host the api servers of the OpenStack components as well as the rabbitmq server. When running concurrency tests against rabbitmq, the CPU share of the rabbitmq container on each node can be raised so that it gets more CPU. If conditions allow, the RabbitMQ cluster should be deployed on its own set of servers so that it does not compete with other services for CPU; a RabbitMQ cluster deployed on its own has higher concurrency capacity.

Configuration method:

docker update --cpu-shares 10240 rabbitmq-server
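Because the share is relative, its effect depends on what else runs on the host. The sketch below computes the CPU time fraction each container would receive under full contention. The container names other than rabbitmq-server are hypothetical, and the calculation only mirrors the proportional rule described above.

```python
def cpu_fractions(shares):
    """Map each container to its fraction of CPU time under full contention."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive share")
    return {name: value / total for name, value in shares.items()}


# rabbitmq-server raised to 10240, three other containers left at the
# default share of 1024 (hypothetical control-node layout).
EXAMPLE_SHARES = {
    "rabbitmq-server": 10240,
    "nova-api": 1024,
    "keystone": 1024,
    "neutron-server": 1024,
}
```

With these example values rabbitmq-server is entitled to 10240 / 13312 of the CPU time when every container is busy; on an idle host the setting has no visible effect.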


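- ha-mode

A RabbitMQ cluster can mirror a queue across several nodes. Each mirrored queue consists of one master node and several slave nodes. A consumer's operations are first completed on the master and then repeated on the slaves. Messages published by producers are replicated to all mirrors. All other operations go through the master, which applies them to the slaves. The mirroring policies are:

| ha-mode | ha-params | Description |
|---|---|---|
| all | (none) | Every node in the cluster holds a mirror of the queue |
| exactly | count | The number of queue mirrors in the cluster |
| nodes | node names | Mirrors are placed on the listed nodes |

The default ha-mode is all, so in a three-node cluster each mirrored queue is replicated on all three nodes. The command for defining a policy:

rabbitmqctl set_policy -p vhost

- pattern: a regular expression; the policy is applied to the exchanges or queues whose names match it.

- definition: the parameters of the policy.

- Recommended configuration

rabbitmqctl set_policy -p / ha-exactly '^' '{"ha-mode":"exactly","ha-params":2}'

A cluster with two mirrors per queue handles high concurrency better than one with three mirrors; the former supports 1.75 times the concurrency of the latter.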
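The policy definition is a JSON document, and shell quoting around it is easy to get wrong. As a sketch, the command line can be generated so that the JSON and the quoting are always valid. The helper name `mirror_policy_command` is hypothetical; the `rabbitmqctl set_policy` arguments follow the recommended command above.

```python
import json
import shlex


def mirror_policy_command(name, pattern, count, vhost="/"):
    """Build a shell-safe rabbitmqctl command for an 'exactly' mirroring policy."""
    definition = json.dumps({"ha-mode": "exactly", "ha-params": count})
    args = ["rabbitmqctl", "set_policy", "-p", vhost, name, pattern, definition]
    return shlex.join(args)  # requires Python 3.8 or later
```

`mirror_policy_command("ha-exactly", "^", 2)` yields a command whose last argument is the single-quoted JSON definition with double-quoted keys, which is the form rabbitmqctl expects.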


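- CPU binding

RabbitMQ runs in the Erlang virtual machine. The Erlang version used in this environment supports SMP and uses multiple schedulers with multiple run queues: when the Erlang VM starts, it launches by default as many schedulers as there are logical CPU cores (this can be limited with the +S startup parameter), and each scheduler takes processes to run from its own run queue. Because of OS thread scheduling, however, Erlang scheduler threads migrate between cores, which increases cache misses and hurts performance. The +sbt parameter binds schedulers to logical cores. Erlang supports several binding strategies; see the Erlang documentation for details.

- Recommended configuration

The default setting is db, which binds round-robin across numa nodes so that all nodes are used where possible. However, the run queues of the schedulers are balanced against each other and tasks migrate between queues, so to make better use of the cache we recommend, for the test environment in this document, setting +sbt to nnts, which binds scheduler threads in numa node order.

Edit the RabbitMQ configuration file rabbitmq-env.conf:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS='+sbt nnts'

Restart the RabbitMQ service for the change to take effect.

Users can also combine this with other Erlang VM tuning methods as appropriate (such as enabling HiPE), but the details of Erlang VM parameters and tuning are outside the scope of this document.

- Heartbeat keepalive

The heartbeat is used to detect whether the peer of a connection is alive. It works by checking whether data is being sent and received normally on the socket; if nothing has been sent or received for a while, a heartbeat packet is sent to the peer, and if no reply arrives within a certain time the heartbeat is considered timed out and the peer may have crashed. Under heavy load a component may fail to process heartbeat messages in time, so the RabbitMQ server does not receive the heartbeat reply within the timeout, closes the connection and causes errors. Increase the heartbeat timeout of both the component and the RabbitMQ server appropriately to avoid this error. See the related description in the Cinder section.

Edit the RabbitMQ configuration file rabbitmq.conf:

heartbeat = 180

Restart the RabbitMQ service for the change to take effect.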


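2.2 oslo_messaging

All functional components connect to RabbitMQ through oslo_messaging. The library provides the following parameters for optimizing RPC processing under high load.

- rpc_conn_pool_size: size of the RPC connection pool. The default is 30.

- executor_thread_pool_size: number of threads or coroutines that execute RPC processing. The default is 64.

- rpc_response_timeout: timeout for the response to an RPC call. The default is 60 s.

In the QG刮刮乐 tests the first two parameters were left at their defaults, and rpc_response_timeout was increased appropriately according to the actual processing capacity of each component.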


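3. Nova

3.1 nova-api

- Workers

nova-api is a WSGI server, also called the api server, that receives REST requests from outside. On startup the API server creates a number of worker threads, as set in the configuration file, to receive requests. If there are too few workers for the concurrency, requests queue up and processing time increases, so the administrator can set the number of workers according to the actual workload. For an api server that has a node to itself, workers can usually be set equal to the number of CPU cores. When several api servers share one node (as on a control node), physical CPU resources are limited, and the workers of each api server need to be adjusted according to the capacity of the server and the load on each api server.

For api servers deployed in containers, also consider setting NUMA affinity for the CPU and memory used by each container, keeping the affinities of different containers separate where possible; this reduces CPU contention and makes better use of the cache. The same approach applies to other services deployed in containers.

The nova-api configuration file has several such options, for example osapi_compute_workers and metadata_workers, which can be adjusted according to the capacity and load of the node.

Other components, such as glance, have their own api server; as with nova-api, increasing the number of workers can improve request processing.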
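When several api servers share a control node, one simple starting point is to divide the available cores in proportion to the observed load of each server. The sketch below only illustrates that idea: the function name, the load weights and the 16-core node are hypothetical, and the resulting numbers are a first guess to refine through testing, not recommended values.

```python
def split_workers(total_cores, load_weights):
    """Divide total_cores among services in proportion to integer load weights.

    Every service gets at least one worker; fractions are rounded down.
    """
    total_weight = sum(load_weights.values())
    if total_weight <= 0:
        raise ValueError("load weights must sum to a positive number")
    return {
        name: max(1, (total_cores * weight) // total_weight)
        for name, weight in load_weights.items()
    }


# Hypothetical relative load observed on one 16-core control node.
EXAMPLE_WEIGHTS = {"nova-api": 4, "cinder-api": 2, "glance-api": 1, "neutron-server": 1}
```

With these weights, `split_workers(16, EXAMPLE_WEIGHTS)` assigns 8, 4, 2 and 2 workers respectively, values that could then go into options such as osapi_compute_workers.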


- Patch: live migration performance

When Nova Compute executes live-migrate(), the child thread running _live_migration_operation() and the main thread running _live_migration_monitor() both access Instance fields. If Instance.fields has not been accessed before, the two threads may call obj_load_attr() of class Instance in nova/objects/instance.py at the same time, and re-entry into utils.temporary_mutation() leaves Context.read_deleted set to "yes" afterwards. Later, _update_usage_from_instances() then counts deleted instances that do not need to be counted, which adds time. The Stein release of OpenStack has this defect; in later releases other operations fetch Instance.fields earlier, so nova-compute does not need to call obj_load_attr() during live-migrate. For the bug report, see https://bugs.launchpad.net/nova/+bug/1941819. The bug can be avoided with the patch https://github.com/openstack/nova/commit/84db8b3f3d202a234221ed265ec00a7cf32999c9, which fetches Instance.fields early in Nova API.

For patch details, see Appendix A1.1.


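3.2 Nova-scheduler & nova-conductor

These two services also have workers options that control the number of processing threads and can be adjusted according to the actual workload; in the QG刮刮乐 tests the default values were sufficient. The default workers value of nova-conductor is the number of CPU cores. The default for nova-scheduler is the number of CPU cores when the filter scheduler is used and 1 for other schedulers.

3.3 nova-compute

- vcpu_pin_set

Restricts the range of pCPUs on the compute node that guests may use, leaving some CPUs for the host so that it keeps working normally. This solves the problem of insufficient host performance when heavily loaded virtual machines compete for CPU.

- Recommended configuration

Reserve one physical CPU core on every numa node for the host. Taking a platform with two numa nodes as an example, cpu0 and cpu15 can be reserved for the host, configured as follows:

Edit the nova-compute configuration file nova.conf:

vcpu_pin_set = 1-14

Restart the nova_compute service for the change to take effect.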
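The value of vcpu_pin_set can be derived from the NUMA layout and the cores reserved for the host. The sketch below reproduces the example above, assuming a platform with 16 logical CPUs, cpu0-7 on the first numa node and cpu8-15 on the second (the source gives only the reserved cores, so this layout is an assumption). The helper names are hypothetical.

```python
def format_cpu_set(cpus):
    """Collapse CPU ids into a range string such as '1-7,9-15'."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(
        str(first) if first == last else f"{first}-{last}" for first, last in ranges
    )


def guest_cpu_set(numa_nodes, reserved):
    """Return the vcpu_pin_set string: all CPUs minus those reserved for the host."""
    all_cpus = {cpu for node in numa_nodes for cpu in node}
    return format_cpu_set(all_cpus - set(reserved))


# Assumed layout: two numa nodes with eight logical CPUs each.
EXAMPLE_NODES = [range(0, 8), range(8, 16)]
```

`guest_cpu_set(EXAMPLE_NODES, {0, 15})` returns `1-14`, matching the setting shown above; reserving the first core of each node instead (`{0, 8}`) would give `1-7,9-15`.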


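- reserved_host_memory_mb

This parameter reserves memory on the compute node for the host system, so that virtual machines cannot take so much memory that tasks on the host fail to run normally.

- Recommended configuration

The value must be decided from the total amount of memory, the tasks running on the host and the maximum memory the virtual machines are expected to use. Reserving at least 1024 MB for the system is recommended.

Edit the nova-compute configuration file nova.conf:

# unit: MB
reserved_host_memory_mb = 4096

Restart the nova_compute service for the change to take effect.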


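- cpu_allocation_ratio

Sets the overcommit ratio for vCPUs.

- Recommended configuration

Given the performance of the ZHAOXIN CPU, in a desktop cloud system one pCPU can be virtualized into 2 vCPUs.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]
cpu_allocation_ratio = 2

Restart the nova_compute service for the change to take effect.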


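- block_device_allocate_retries

When a virtual machine with a block device is created, a volume has to be created from blank, an image or a snapshot. Before the volume can be attached to the virtual machine, its status must be "available". block_device_allocate_retries sets how many times nova checks whether the volume status is "available". The related parameter block_device_allocate_retries_interval sets the interval between checks; its default is 3, in seconds.

- Recommended configuration

The default is 60 checks. When cinder is heavily loaded, the volume may still not be "available" after 60 checks; increase the number of checks appropriately to keep virtual machine creation from failing.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]
block_device_allocate_retries = 150

Restart the nova_compute service for the change to take effect.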
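The retry count and the polling interval together determine roughly how long nova waits for a volume. The following sketch shows the arithmetic; the function names are hypothetical, and the result is an approximation of nova's behavior rather than an exact reproduction of its loop.

```python
def volume_wait_budget_s(retries, interval_s=3):
    """Approximate time in seconds nova will wait for a volume to become available."""
    return retries * interval_s


def retries_needed(observed_wait_s, interval_s=3):
    """Smallest retry count whose budget covers an observed wait (ceiling division)."""
    return -(-observed_wait_s // interval_s)
```

With the default interval of 3 s, the default of 60 retries allows about 180 s, and the value of 150 used above allows about 450 s. If volumes were observed to take up to 400 s under load, `retries_needed(400)` gives 134 as the minimum count.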


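- vif_plugging_timeout

The time nova-compute waits for the Neutron VIF plugging event message to arrive.

- Recommended configuration

The default is 300, in seconds. When many virtual machines are created concurrently, the event may not arrive within 300 s. In the QG刮刮乐 desktop cloud test, creating virtual machines at a concurrency of 200 took about 360 s. The value can be adjusted according to the maximum concurrency of the system.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]
vif_plugging_timeout = 500

Restart the nova_compute service for the change to take effect.

- Patch: live migration performance

This patch does two things: it removes an unnecessary get_volume_connect() call during migration, and it reduces unnecessary Neutron access. It makes live migration more efficient and shortens the service downtime of a live migration. Patch addresses:

https://review.opendev.org/c/openstack/nova/+/795027

https://opendev.org/openstack/nova/commit/6488a5dfb293831a448596e2084f484dd0bfa916

For patch details, see Appendix A1.2.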


4. Cinder

4.1 cinder-api

- Workers

See nova-api. The difference is that in Stein cinder-api is deployed under httpd, so besides the workers options that cinder supports (such as osapi_volume_workers), processes and threads in cinder-wsgi.conf can also be tuned. Similar components include keystone and horizon.

Edit cinder-wsgi.conf:

WSGIDaemonProcess cinder-api processes=12 threads=3 user=cinder group=cinder display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

……

Restart the cinder-api service for the change to take effect.

- rpc_response_timeout

The time cinder-api waits for an RPC message to return.

- Recommended configuration

The default is 60, in seconds. During highly concurrent attach_volume operations, cinder-volume takes longer to respond to cinder-api. If rpc timeout errors are reported, increase the value appropriately.

Edit the cinder configuration file cinder.conf:

[DEFAULT]
rpc_response_timeout = 600

Restart the cinder-api service for the change to take effect.


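4.2 cinder-volume

- Heartbeat keepalive

With RabbitMQ

See the "Heartbeat keepalive" subsection in the RabbitMQ section.

Cinder's heartbeat_timeout_threshold sets the heartbeat timeout; heartbeat signals are sent at an interval of half the heartbeat timeout.

- Recommended configuration

The default heartbeat_timeout_threshold of cinder-volume is 60, in seconds. Under heavy load, cinder-volume may fail to process heartbeat messages in time, so the RabbitMQ server does not receive the heartbeat reply within the timeout. The RabbitMQ server then closes the connection because cinder-volume did not respond in time, which leads to a series of errors. Increase the heartbeat timeout of both cinder-volume and the RabbitMQ server appropriately to avoid this error. Disabling the heartbeat mechanism (heartbeat=0) is not recommended.

Edit the cinder-volume configuration file cinder.conf:

[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 180

Restart the cinder-volume service for the change to take effect.

Between services

OpenStack is a distributed system made up of services that run on different hosts and cooperate to complete each task. Every service periodically updates its own update time in the database, and services determine whether a peer is online by checking whether the time since the peer's update time exceeds the configured service_down_time. This can also be seen as a heartbeat mechanism.

Under high load, database access may be delayed, and the periodic task that reports status may itself be delayed because CPU is occupied; either can trigger a false service down report.

- Recommended configuration

report_interval: the status report interval, that is, the heartbeat interval. The default is 10, in seconds.

service_down_time: the maximum time allowed since the last heartbeat. The default is 60, in seconds. A service with no heartbeat for longer than this is considered down.

report_interval must be smaller than service_down_time. Increase service_down_time appropriately so that cinder-volume is not mistaken for down when its periodic tasks occupy the CPU and delay the status report.

Edit the cinder-volume configuration file cinder.conf:

service_down_time = 120

Restart the cinder_volume service for the change to take effect.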
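The liveness rule described above can be written as a one-line comparison. The sketch below is a simplified illustration of that rule, not the actual OpenStack implementation, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def service_is_up(last_update, now, service_down_time_s=60):
    """A service counts as up if its last status report is recent enough."""
    return now - last_update <= timedelta(seconds=service_down_time_s)
```

For example, a status report delayed to 90 s by a busy CPU makes the service look down with the default of 60 s, but it is still considered up with service_down_time = 120.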


- rbd_exclusive_cinder_pool

OpenStack Ocata introduced the parameter rbd_exclusive_cinder_pool. If the RBD pool is used exclusively by Cinder, rbd_exclusive_cinder_pool=true can be set. Cinder then obtains the provisioned size by querying the database instead of polling every volume in the backend, which clearly shortens the query time and reduces the load on both the Ceph cluster and the cinder-volume service.

- Recommended configuration

Edit the cinder-volume configuration file cinder.conf:

[DEFAULT]
enabled_backends = rbd-1

[rbd-1]
rbd_exclusive_cinder_pool = true

Restart the cinder-volume service for the change to take effect.


- image_volume_cache_enabled

Since the Liberty release, Cinder can use an image volume cache, which improves the performance of creating volumes from images. The first time a volume is created from an image, a cached image-volume owned by the block storage Internal Tenant is created as well. Later volumes created from the same image are cloned from the cached image-volume, so the image no longer has to be downloaded locally and written into the volume.

- Recommended configuration

cinder_internal_tenant_project_id: the ID of the OpenStack project "service".

cinder_internal_tenant_user_id: the ID of the OpenStack user "cinder".

image_volume_cache_max_size_gb: the maximum size of the cached image-volumes; 0 means no limit.

image_volume_cache_max_count: the maximum number of cached image-volumes; 0 means no limit.

Edit the cinder-volume configuration file cinder.conf:

[DEFAULT]
cinder_internal_tenant_project_id = c4076a45bcac411bacf20eb4fecb50e0
cinder_internal_tenant_user_id = 4fe8e33010fd4263be493c1c9681bec8

[backend_defaults]
image_volume_cache_enabled=True
image_volume_cache_max_size_gb = 0
image_volume_cache_max_count = 0

Restart the cinder-volume service for the change to take effect.


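5. Neutron

5.1 Neutron Service

neutron-service is the api server of the neutron component. For its tuning, see the description under nova-api; the adjustable parameters are api_workers and metadata_workers.

- rpc_workers

In the neutron architecture, the core service and each plugin are handled by having the main process fork child processes, which then create coroutines to run the handlers; this achieves concurrent processing across multiple cores. rpc_workers controls the number of processes created for RPC handling; the default is half of api_workers. Because the QG刮刮乐 system is deployed in containers, the default value is sufficient.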


5.2 Neutron DHCP Agent

The two patches here mainly affect the neutron services on the network node.

- Improving Network Port management efficiency

Patch1

In the Neutron DHCP agent, the "ip route" command run through the oslo.rootwrap library is replaced with Pyroute2. The patch makes the Neutron DHCP agent more efficient when creating and deleting ports. Patch address:

https://opendev.org/openstack/neutron/commit/06997136097152ea67611ec56b345e5867184df5

For patch details, see Appendix A1.3.

Patch2

In the Neutron DHCP agent, the "dhcp_release" command run through the oslo.rootwrap library is replaced with the oslo.privsep library. The patch makes the Neutron DHCP agent more efficient when creating and deleting ports. Patch addresses:

https://opendev.org/openstack/neutron/commit/e332054d63cfc6a2f8f65b1b9de192ae0df9ebb3

https://opendev.org/openstack/neutron/commit/2864957ca53a346418f89cc945bba5cdcf204799

For patch details, see Appendix A1.4.


5.3 Neutron OpenvSwitch Agent

- Improving Network Port processing efficiency

polling_interval

If the Neutron L2 agent is configured as the openvswitch agent, neutron-openvswitch-agent runs an RPC loop after startup that handles the addition, deletion and modification of ports. The polling_interval option sets the interval at which the RPC loop runs.

- Recommended configuration

The default is 2, in seconds. Reducing the value makes port status updates faster; during live migration in particular, a smaller value shortens the migration time. Setting it to 0, however, makes neutron-openvswitch-agent consume too much CPU.

Edit the neutron-openvswitch-agent configuration file ml2_conf.ini:

[agent]
polling_interval = 1

Restart the neutron-openvswitch-agent service on the compute nodes for the change to take effect.

Patch

In the Neutron openvswitch agent, the "iptables" and "ipset" commands run through the oslo.rootwrap library are replaced with the oslo.privsep library. The patch makes the Neutron openvswitch agent more efficient when processing network ports. Patch addresses:

https://opendev.org/openstack/neutron/commit/6c75316ca0a7ee2f6513bb6bc0797678ef419d24

https://opendev.org/openstack/neutron/commit/5a419cbc84e26b4a3b1d0dbe5166c1ab83cc825b

For patch details, see Appendix A1.5.


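5.4 Live migration down time optimization

In live migration tests on OpenStack Stein, after a virtual machine has migrated to the destination host, the network cannot be pinged immediately and there is a noticeable delay. The cause is that the virtual machine sends RARP broadcasts right after a successful migration, while the port of its NIC is not yet actually up. Bug information:

https://bugs.launchpad.net/neutron/+bug/1901707

https://bugs.launchpad.net/neutron/+bug/1815989

For patch details, see Appendix B1.1 to B1.7; the patches involve the neutron and nova modules:

https://review.opendev.org/c/openstack/neutron/+/790702

https://review.opendev.org/c/openstack/nova/+/767368

5.5 Network performance optimization

For stable, high network performance when deploying virtual machines, the NIC hardware interrupts and the corresponding virtual machine are best confined to CPUs in the same Cluster. This avoids unnecessary cache misses and improves the stability and performance of the network.

5.6 VXLAN performance optimization

Mainstream tunnel networks, such as VXLAN, are generally built on UDP. When the UDP checksum field is zero, the receiver cannot apply GRO (generic receive offload) to VXLAN packets in time, which severely degrades network performance. The community has already fixed this problem; for details see the link below:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=89e5c58fc1e2857ccdaae506fb8bc5fed57ee063

With this patch applied, the result of the same VXLAN iperf3 test on a 10-Gigabit NIC improves by more than 2 times.

For patch details, see Appendix C1.1.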


6. Keystone

- Number of concurrent processes

This is a parameter of the WSGIDaemonProcess daemon. processes defines the number of processes the daemon starts.

- Recommended configuration

The default is 1. When keystone is under heavy pressure, a single WSGI process cannot handle a large number of concurrent requests. Increase processes appropriately; values above the number of CPU cores bring little benefit.

Edit the keystone configuration file wsgi-keystone.conf:

WSGIDaemonProcess keystone-public processes=12 threads=1 user=keystone group=keystone display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

……

WSGIDaemonProcess keystone-admin processes=12 threads=1 user=keystone group=keystone display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

……

Restart the keystone service for the change to take effect.


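7. Haproxy

- Timeout settings

1. timeout http-request: the maximum time allowed for an HTTP request.

2. timeout queue: when the number of requests on a server reaches maxconn, new connections are added to the specified queue. When a request has waited in the queue longer than timeout queue, it is considered unserved and is dropped, and a 503 error is returned to the client.

3. timeout connect: the timeout for establishing a connection to a backend server.

4. timeout client: the maximum inactivity time on the client side when the client is expected to send data or acknowledge.

5. timeout server: the maximum inactivity time on the server side.

- Recommended configuration

Edit the haproxy configuration file haproxy.cfg; the unit s means seconds and the unit m means minutes.

defaults
timeout http-request 100s
timeout queue 4m
timeout connect 100s
timeout client 10m
timeout server 10m

Restart the haproxy service for the change to take effect.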


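- Maximum number of connections

Haproxy accepts a global maxconn that defines the maximum number of simultaneous connections for Haproxy as a whole; maxconn can also be set on a backend server to define the maximum number of connections to that server, and on a frontend to define the maximum number of connections on that port. The system's ulimit -n value must be greater than maxconn.

- Recommended configuration

The default global maxconn is 4000. In use, the Haproxy console showed that the global connection limit was insufficient, so it was increased to 40000.

Edit the Haproxy configuration file haproxy.cfg:

global
maxconn 40000

Restart the haproxy service for the change to take effect.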
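The requirement that the open-file limit exceed maxconn can be checked before restarting the service. The sketch below uses Python's standard `resource` module (Unix only) and reads the limit of the process that runs the check, so it should be run in the same environment (for example the same container) as haproxy. The function name is hypothetical.

```python
import resource


def fd_limit_allows(maxconn, soft_limit=None):
    """Return True if the soft open-file limit is greater than maxconn."""
    if soft_limit is None:
        # Equivalent of `ulimit -n` for the current process.
        soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > maxconn
```

With a typical default limit of 1024, `fd_limit_allows(40000, soft_limit=1024)` is false, which signals that the limit has to be raised along with maxconn.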


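- Number of processing processes

nbproc sets the number of concurrent processes Haproxy uses for load balancing. The Haproxy version in OpenStack Stein is 1.5.18. The parameter was removed in haproxy 2.5, where the nbthread parameter specifies the number of threads instead.

- Recommended configuration

Because the load on Haproxy is heavy, increasing this parameter appropriately is recommended.

Edit the Haproxy configuration file haproxy.cfg:

global
nbproc 4

Restart the haproxy service for the change to take effect.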


- Weight

The backend parameter weight sets the weight of a server, in the range 0-256. The higher the weight, the more requests are sent to that server. A server with weight 0 receives no new connections. The default for all servers is 1.

- Recommended configuration

When the backend hosts are under uneven pressure, the weight of the server on the busier host can be reduced so that the load across hosts evens out.

Take a small cluster with 3 control nodes as an example. During the test, overall CPU usage on controller03 was as high as 95% or more, while the other two control nodes were at about 70%, and keystone CPU usage was high on every control node. Reducing the weight of the keystone server on controller03 relieves the CPU pressure on controller03.

Edit the Haproxy configuration file keystone.cfg:

listen keystone_external
    mode http
    http-request del-header X-Forwarded-Proto
    option httplog
    option forwardfor
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    bind haproxy-ip-addr:5000
    maxconn 5000
    server controller01 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 10
    server controller02 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 10
    server controller03 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 9

Restart the haproxy service for the change to take effect.

~~~~~~~~~~~~~~~~~~

Appendix

A1.1

diff -ruN nova-bak/api/openstack/compute/migrate_server.py nova/api/openstack/compute/migrate_server.py

--- nova-bak/api/openstack/compute/migrate_server.py??? 2021-09-10 11:20:15.774990677 +0800

+++ nova/api/openstack/compute/migrate_server.py 2021-09-10 11:23:22.239098421 +0800

@@ -157,7 +157,9 @@

?????????????????????????????? 'conductor during pre-live-migration checks '

?????????????????????????????? ''%(ex)s'', {'ex': ex})

???????????? else:

-??????????????? raise exc.HTTPBadRequest(explanation=ex.format_message())

+?????????????? raise exc.HTTPBadRequest(explanation=ex.format_message())

+??????? except exception.OperationNotSupportedForSEV as e:

+??????????? raise exc.HTTPConflict(explanation=e.format_message())

???????? except exception.InstanceIsLocked as e:

???????????? raise exc.HTTPConflict(explanation=e.format_message())

???????? except exception.ComputeHostNotFound as e:

diff -ruN nova-bak/api/openstack/compute/suspend_server.py nova/api/openstack/compute/suspend_server.py

--- nova-bak/api/openstack/compute/suspend_server.py?? 2021-09-10 11:25:03.847439106 +0800

+++ nova/api/openstack/compute/suspend_server.py 2021-09-10 11:27:09.958950964 +0800

@@ -40,7 +40,8 @@

???????????? self.compute_api.suspend(context, server)

???????? except exception.InstanceUnknownCell as e:

???????????? raise exc.HTTPNotFound(explanation=e.format_message())

-??????? except exception.InstanceIsLocked as e:

+??????? except (exception.OperationNotSupportedForSEV,

+??????????????? exception.InstanceIsLocked) as e:

???????????? raise exc.HTTPConflict(explanation=e.format_message())

???????? except exception.InstanceInvalidState as state_error:

???????????? common.raise_http_conflict_for_instance_invalid_state(state_error,

diff -ruN nova-bak/compute/api.py nova/compute/api.py

--- nova-bak/compute/api.py?? 2021-09-10 11:31:55.278077457 +0800

+++ nova/compute/api.py 2021-09-10 15:32:28.131175652 +0800

@@ -215,6 +215,23 @@

???????? return fn(self, context, instance, *args, **kwargs)

???? return _wrapped

?

+def reject_sev_instances(operation):

+??? '''Decorator.? Raise OperationNotSupportedForSEV if instance has SEV

+??? enabled.

+??? '''

+

+??? def outer(f):

+??????? @six.wraps(f)

+??????? def inner(self, context, instance, *args, **kw):

+??????????? if hardware.get_mem_encryption_constraint(instance.flavor,

+????????????????????????????????????????????????????? instance.image_meta):

+??????????????? raise exception.OperationNotSupportedForSEV(

+??????????????????? instance_uuid=instance.uuid,

+??????????????????? operation=operation)

+??????????? return f(self, context, instance, *args, **kw)

+??????? return inner

+??? return outer

+

?

?def _diff_dict(orig, new):

???? '''Return a dict describing how to change orig to new.? The keys

@@ -690,6 +707,9 @@

???????? '''

? ???????image_meta = _get_image_meta_obj(image)

?

+??????? API._validate_flavor_image_mem_encryption(instance_type, image_meta)

+?????

+

???????? # Only validate values of flavor/image so the return results of

???????? # following 'get' functions are not used.

???????? hardware.get_number_of_serial_ports(instance_type, image_meta)

@@ -701,6 +721,19 @@

???????? if validate_pci:

???????????? pci_request.get_pci_requests_from_flavor(instance_type)

?

+??? @staticmethod

+??? def _validate_flavor_image_mem_encryption(instance_type, image):

+??????? '''Validate that the flavor and image don't make contradictory

+??????? requests regarding memory encryption.

+??????? :param instance_type: Flavor object

+??????? :param image: an ImageMeta object

+??????? :raises: nova.exception.FlavorImageConflict

+??????? '''

+??????? # This library function will raise the exception for us if

+??????? # necessary; if not, we can ignore the result returned.

+??????? hardware.get_mem_encryption_constraint(instance_type, image)

+

+

??? ?def _get_image_defined_bdms(self, instance_type, image_meta,

???????????????????????????????? root_device_name):

???????? image_properties = image_meta.get('properties', {})

@@ -3915,6 +3948,7 @@

???????? return self.compute_rpcapi.get_instance_diagnostics(context,

???????????????????????????????????????????????????????????? instance=instance)

?

+??? @reject_sev_instances(instance_actions.SUSPEND)

???? @check_instance_lock

???? @check_instance_cell

???? @check_instance_state(vm_state=[vm_states.ACTIVE])

@@ -4699,6 +4733,7 @@

????????????????????????????????????????????????????? diff=diff)

???????? return _metadata

?

+??? @reject_sev_instances(instance_actions.SUSPEND)

???? @check_instance_lock

???? @check_instance_cell

???? @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.PAUSED])

diff -ruN nova-bak/exception.py nova/exception.py

--- nova-bak/exception.py??????? 2021-09-10 11:35:25.491284738 +0800

+++ nova/exception.py???? 2021-09-10 11:36:09.799787563 +0800

@@ -536,6 +536,10 @@

???? msg_fmt = _('Unable to migrate instance (%(instance_id)s) '

???????????????? 'to current host (%(host)s).')

?

+class OperationNotSupportedForSEV(NovaException):

+??? msg_fmt = _('Operation '%(operation)s' not supported for SEV-enabled '

+??????????????? 'instance (%(instance_uuid)s).')

+??? code = 409

?

?class InvalidHypervisorType(Invalid):

???? msg_fmt = _('The supplied hypervisor type of is invalid.')

diff -ruN nova-bak/objects/image_meta.py nova/objects/image_meta.py

--- nova-bak/objects/image_meta.py 2021-09-10 15:16:30.530628464 +0800

+++ nova/objects/image_meta.py????? 2021-09-10 15:19:26.999151245 +0800

@@ -177,6 +177,9 @@

???????? super(ImageMetaProps, self).obj_make_compatible(primitive,

???????????????????????????????????????????????????????? target_version)

???????? target_version = versionutils.convert_version_to_tuple(target_version)

+???????

+??????? if target_version < (1, 24):

+??????????? primitive.pop('hw_mem_encryption', None)

???????? if target_version < (1, 21):

???????????? primitive.pop('hw_time_hpet', None)

??? ?????if target_version < (1, 20):

@@ -298,6 +301,11 @@

???????? # is not practical to enumerate them all. So we use a free

???????? # form string

???????? 'hw_machine_type': fields.StringField(),

+???????

+??????? # boolean indicating that the guest needs to be booted with

+??????? # encrypted memory

+??????? 'hw_mem_encryption': fields.FlexibleBooleanField(),

+

?

???????? # One of the magic strings 'small', 'any', 'large'

???????? # or an explicit page size in KB (eg 4, 2048, ...)

diff -ruN nova-bak/scheduler/utils.py nova/scheduler/utils.py

--- nova-bak/scheduler/utils.py 2021-09-10 15:19:58.172561042 +0800

+++ nova/scheduler/utils.py????? 2021-09-10 15:35:05.630393147 +0800

@@ -35,7 +35,7 @@

?from nova.objects import instance as obj_instance

?from nova import rpc

?from nova.scheduler.filters import utils as filters_utils

-

+import nova.virt.hardware as hw

?

?LOG = logging.getLogger(__name__)

?

@@ -61,6 +61,27 @@

???????? # Default to the configured limit but _limit can be

???????? # set to None to indicate 'no limit'.

???????? self._limit = CONF.scheduler.max_placement_results

+??????? image = (request_spec.image if 'image' in request_spec

+???????????????? else objects.ImageMeta(properties=objects.ImageMetaProps()))

+??????? self._translate_memory_encryption(request_spec.flavor, image)

+

+??? def _translate_memory_encryption(self, flavor, image):

+??????? '''When the hw:mem_encryption extra spec or the hw_mem_encryption

+??????? image property are requested, translate into a request for

+??????? resources:MEM_ENCRYPTION_CONTEXT=1 which requires a slot on a

+??????? host which can support encryption of the guest memory.

+??????? '''

+??????? # NOTE(aspiers): In theory this could raise FlavorImageConflict,

+??????? # but we already check it in the API layer, so that should never

+??????? # happen.

+??????? if not hw.get_mem_encryption_constraint(flavor, image):

+??????????? # No memory encryption required, so no further action required.

+??????????? return

+

+??????? self._add_resource(None, orc.MEM_ENCRYPTION_CONTEXT, 1)

+??????? LOG.debug('Added %s=1 to requested resources',

+????????????????? orc.MEM_ENCRYPTION_CONTEXT)

+

?

???? def __str__(self):

???????? return ', '.join(sorted(

diff -ruN nova-bak/virt/hardware.py nova/virt/hardware.py

--- nova-bak/virt/hardware.py? 2022-02-23 10:45:42.320988102 +0800

+++ nova/virt/hardware.py??????? 2021-09-10 14:05:25.145572630 +0800

@@ -1140,6 +1140,67 @@

?

???? return flavor_policy, image_policy

?

+def get_mem_encryption_constraint(flavor, image_meta, machine_type=None):

+??? '''Return a boolean indicating whether encryption of guest memory was

+??? requested, either via the hw:mem_encryption extra spec or the

+??? hw_mem_encryption image property (or both).

+??? Also watch out for contradictory requests between the flavor and

+??? image regarding memory encryption, and raise an exception where

+    encountered.  These conflicts can arise in three different ways:

+??????? 1) the flavor requests memory encryption but the image

+?????????? explicitly requests *not* to have memory encryption, or

+?????????? vice-versa

+??????? 2) the flavor and/or image request memory encryption, but the

+?????????? image is missing hw_firmware_type=uefi

+??????? 3) the flavor and/or image request memory encryption, but the

+?????????? machine type is set to a value which does not contain 'q35'

+??? This can be called from the libvirt driver on the compute node, in

+??? which case the driver should pass the result of

+??? nova.virt.libvirt.utils.get_machine_type() as the machine_type

+??? parameter, or from the API layer, in which case get_machine_type()

+??? cannot be called since it relies on being run from the compute

+??? node in order to retrieve CONF.libvirt.hw_machine_type.

+    :param flavor: Flavor object

+    :param image_meta: an ImageMeta object

+??? :param machine_type: a string representing the machine type (optional)

+??? :raises: nova.exception.FlavorImageConflict

+??? :raises: nova.exception.InvalidMachineType

+??? :returns: boolean indicating whether encryption of guest memory

+??? was requested

+??? '''

+

+??? flavor_mem_enc_str, image_mem_enc = _get_flavor_image_meta(

+??????? 'mem_encryption', flavor, image_meta)

+

+??? flavor_mem_enc = None

+??? if flavor_mem_enc_str is not None:

+??????? flavor_mem_enc = strutils.bool_from_string(flavor_mem_enc_str)

+

+??? # Image property is a FlexibleBooleanField, so coercion to a

+??? # boolean is handled automatically

+

+??? if not flavor_mem_enc and not image_mem_enc:

+??????? return False

+

+??? _check_for_mem_encryption_requirement_conflicts(

+?? ?????flavor_mem_enc_str, flavor_mem_enc, image_mem_enc, flavor, image_meta)

+

+??? # If we get this far, either the extra spec or image property explicitly

+??? # specified a requirement regarding memory encryption, and if both did,

+??? # they are asking for the same thing.

+??? requesters = []

+??? if flavor_mem_enc:

+??????? requesters.append('hw:mem_encryption extra spec in %s flavor' %

+????????????????????????? flavor.name)

+??? if image_mem_enc:

+??????? requesters.append('hw_mem_encryption property of image %s' %

+????????????????????????? image_meta.name)

+

+??? _check_mem_encryption_uses_uefi_image(requesters, image_meta)

+??? _check_mem_encryption_machine_type(image_meta, machine_type)

+

+??? LOG.debug('Memory encryption requested by %s', ' and '.join(requesters))

+??? return True

?

?def _get_numa_pagesize_constraint(flavor, image_meta):

???? '''Return the requested memory page size
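
The decision logic in get_mem_encryption_constraint() above can be hard to follow inside a diff, so here is a minimal standalone sketch of it. It is not nova code: the exception classes, bool_from_string() and mem_encryption_requested() are simplified, hypothetical stand-ins, and the UEFI and q35 checks are folded into one function for brevity.

```python
class FlavorImageConflict(Exception):
    """Hypothetical stand-in for nova.exception.FlavorImageConflict."""


class InvalidMachineType(Exception):
    """Hypothetical stand-in for nova.exception.InvalidMachineType."""


def bool_from_string(value):
    # Simplified stand-in for oslo_utils.strutils.bool_from_string.
    return str(value).strip().lower() in ('1', 't', 'true', 'on', 'y', 'yes')


def mem_encryption_requested(flavor_value, image_value,
                             firmware_type=None, machine_type=None):
    """Return True if guest memory encryption is requested.

    flavor_value: raw hw:mem_encryption extra spec (str), None if unset
    image_value: hw_mem_encryption image property (bool), None if unset
    """
    flavor_enc = None
    if flavor_value is not None:
        flavor_enc = bool_from_string(flavor_value)

    # Neither side asks for encryption: nothing more to check.
    if not flavor_enc and not image_value:
        return False

    # Conflict 1: both sides are set explicitly and they disagree.
    if (flavor_enc is not None and image_value is not None
            and flavor_enc != image_value):
        raise FlavorImageConflict('flavor and image disagree')

    # Conflict 2: the image must declare hw_firmware_type=uefi.
    if firmware_type != 'uefi':
        raise FlavorImageConflict('hw_firmware_type=uefi is required')

    # Conflict 3: the machine type, when known, must contain 'q35'.
    if machine_type is not None and 'q35' not in machine_type:
        raise InvalidMachineType('machine type must contain q35')

    return True
```

For example, mem_encryption_requested('true', None, firmware_type='uefi') returns True, while passing 'true' for the flavor together with False for the image raises FlavorImageConflict.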

A1.2

diff -ruN nova-bak/compute/manager.py nova/compute/manager.py

--- nova-bak/compute/manager.py? 2021-07-07 14:40:15.570807168 +0800

+++ nova/compute/manager.py??????? 2021-10-18 19:02:37.931655551 +0800

@@ -7013,7 +7013,8 @@

???????????????????????????????????????? migrate_data)

?

???????? # Detaching volumes.

-??????? connector = self.driver.get_volume_connector(instance)

+??????? connector = None

+??????? #connector = self.driver.get_volume_connector(instance)

???????? for bdm in source_bdms:

???????????? if bdm.is_volume:

???????????????? # Detaching volumes is a call to an external API that can fail.

@@ -7033,6 +7034,8 @@

???????????????????????? # remove the volume connection without detaching from

???????????????????????? # hypervisor because the instance is not running

???????????????????????? # anymore on the current host

+??????????????????????? if connector is None:

+??????????????????????????? connector = self.driver.get_volume_connector(instance)

???????????????????????? self.volume_api.terminate_connection(ctxt,

????????????????????????????????????????????????????????????? bdm.volume_id,

????????????????????????? ????????????????????????????????????connector)

@@ -7056,8 +7059,10 @@

?

???????? # Releasing vlan.

???????? # (not necessary in current implementation?)

-

-??????? network_info = self.network_api.get_instance_nw_info(ctxt, instance)

+???????

+??????? #changed by Fiona

+??????? #network_info = self.network_api.get_instance_nw_info(ctxt, instance)

+??????? network_info = instance.get_network_info()

?

???????? self._notify_about_instance_usage(ctxt, instance,

?????????????????????????????????????????? 'live_migration._post.start',
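
The A1.2 patch above defers the get_volume_connector() call until a volume connection really has to be terminated. The following sketch shows only that deferred-lookup pattern in isolation. release_volumes() and its arguments are hypothetical names for illustration, not nova APIs, and the real code applies the deferral in a single branch of a larger loop.

```python
def release_volumes(bdms, get_connector, terminate):
    """Terminate volume connections, fetching the connector at most once.

    bdms: iterable of dicts with 'is_volume' and 'volume_id' keys
    get_connector: zero-argument callable (costly in the real driver)
    terminate: callable taking (volume_id, connector)
    """
    connector = None
    for bdm in bdms:
        if not bdm['is_volume']:
            continue
        if connector is None:
            # Deferred until the first volume actually needs it.
            connector = get_connector()
        terminate(bdm['volume_id'], connector)


if __name__ == '__main__':
    lookups = []

    def fake_connector():
        lookups.append(1)
        return {'host': 'compute-1'}

    release_volumes(
        [{'is_volume': True, 'volume_id': 'vol-a'},
         {'is_volume': True, 'volume_id': 'vol-b'}],
        fake_connector,
        lambda volume_id, connector: None)
    print('connector lookups:', len(lookups))
```

An instance with no attached volumes never triggers the lookup, which is presumably where the patch saves time during post-live-migration cleanup.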

A1.3

diff -ruN neutron-bak/agent/l3/router_info.py neutron-iproute/agent/l3/router_info.py

--- neutron-bak/agent/l3/router_info.py??? 2020-12-14 18:00:23.683687327 +0800

+++ neutron-iproute/agent/l3/router_info.py???? 2022-02-23 15:18:15.650669589 +0800

@@ -748,8 +748,10 @@

???????? for ip_version in (lib_constants.IP_VERSION_4,

??????????????????????????? lib_constants.IP_VERSION_6):

???????????? gateway = device.route.get_gateway(ip_version=ip_version)

-??????????? if gateway and gateway.get('gateway'):

-??????????????? current_gateways.add(gateway.get('gateway'))

+#??????????? if gateway and gateway.get('gateway'):

+#??????????????? current_gateways.add(gateway.get('gateway'))

+??????????? if gateway and gateway.get('via'):

+???????????? ???current_gateways.add(gateway.get('via'))

???????? for ip in current_gateways - set(gateway_ips):

???????????? device.route.delete_gateway(ip)

???????? for ip in gateway_ips:

diff -ruN neutron-bak/agent/linux/ip_lib.py neutron-iproute/agent/linux/ip_lib.py

--- neutron-bak/agent/linux/ip_lib.py 2020-12-14 18:03:47.951878754 +0800

+++ neutron-iproute/agent/linux/ip_lib.py 2022-02-23 15:19:03.981457532 +0800

@@ -48,6 +48,8 @@

?????????????????? 'main': 254,

?????????????????? 'local': 255}

?

+IP_RULE_TABLES_NAMES = {v: k for k, v in IP_RULE_TABLES.items()}

+

?# Rule indexes: pyroute2.netlink.rtnl

?# Rule names: https://www.systutorials.com/docs/linux/man/8-ip-rule/

?# NOTE(ralonsoh): 'masquerade' type is printed as 'nat' in 'ip rule' command

@@ -592,14 +594,18 @@

???? def _dev_args(self):

???????? return ['dev', self.name] if self.name else []

?

-??? def add_gateway(self, gateway, metric=None, table=None):

-??????? ip_version = common_utils.get_ip_version(gateway)

-??????? args = ['replace', 'default', 'via', gateway]

-??????? if metric:

-??????????? args += ['metric', metric]

-??????? args += self._dev_args()

-??????? args += self._table_args(table)

-??????? self._as_root([ip_version], tuple(args))

+#??? def add_gateway(self, gateway, metric=None, table=None):

+#??????? ip_version = common_utils.get_ip_version(gateway)

+#??????? args = ['replace', 'default', 'via', gateway]

+#??????? if metric:

+#??????????? args += ['metric', metric]

+#??????? args += self._dev_args()

+#??????? args += self._table_args(table)

+#??????? self._as_root([ip_version], tuple(args))

+

+??? def add_gateway(self, gateway, metric=None, table=None, scope='global'):

+??????? self.add_route(None, via=gateway, table=table, metric=metric,

+?????????????????????? scope=scope)

?

???? def _run_as_root_detect_device_not_found(self, options, args):

???????? try:

@@ -618,41 +624,16 @@

???????? args += self._table_args(table)

???????? self._run_as_root_detect_device_not_found([ip_version], args)

?

-??? def _parse_routes(self, ip_version, output, **kwargs):

-??????? for line in output.splitlines():

-??????????? parts = line.split()

-

-            # Format of line is: "<cidr>|default [<key> <value>] ..."

-??????????? route = {k: v for k, v in zip(parts[1::2], parts[2::2])}

-??????????? route['cidr'] = parts[0]

-??????????? # Avoids having to explicitly pass around the IP version

-??????????? if route['cidr'] == 'default':

-??????????????? route['cidr'] = constants.IP_ANY[ip_version]

-

-??????????? # ip route drops things like scope and dev from the output if it

-??????????? # was specified as a filter.? This allows us to add them back.

-??????????? if self.name:

-??????????????? route['dev'] = self.name

-??????????? if self._table:

-??????????????? route['table'] = self._table

-??????????? # Callers add any filters they use as kwargs

-??????????? route.update(kwargs)

-

-??????????? yield route

-

-??? def list_routes(self, ip_version, **kwargs):

-??????? args = ['list']

-??????? args += self._dev_args()

-??????? args += self._table_args()

-??????? for k, v in kwargs.items():

-??????????? args += [k, v]

-

-??????? output = self._run([ip_version], tuple(args))

-??????? return [r for r in self._parse_routes(ip_version, output, **kwargs)]

+??? def list_routes(self, ip_version, scope=None, via=None, table=None,

+??????????????????? **kwargs):

+??????? table = table or self._table

+??????? return list_ip_routes(self._parent.namespace, ip_version, scope=scope,

+????????????????????????????? via=via, table=table, device=self.name, **kwargs)

?

???? def list_onlink_routes(self, ip_version):

???????? routes = self.list_routes(ip_version, scope='link')

-??????? return [r for r in routes if 'src' not in r]

+#??????? return [r for r in routes if 'src' not in r]

+??????? return [r for r in routes if not r['source_prefix']]

?

???? def add_onlink_route(self, cidr):

???????? self.add_route(cidr, scope='link')

@@ -660,34 +641,12 @@

???? def delete_onlink_route(self, cidr):

???????? self.delete_route(cidr, scope='link')

?

-??? def get_gateway(self, scope=None, filters=None, ip_version=None):

-??????? options = [ip_version] if ip_version else []

-

-??????? args = ['list']

-??????? args += self._dev_args()

-??????? args += self._table_args()

-??????? if filters:

-??????????? args += filters

-

-??????? retval = None

-

-??????? if scope:

-??????????? args += ['scope', scope]

-

-??????? route_list_lines = self._run(options, tuple(args)).split('\n')

-??????? default_route_line = next((x.strip() for x in

-?????????????????????????????????? route_list_lines if

-???????????????????????????? ??????x.strip().startswith('default')), None)

-??????? if default_route_line:

-??????????? retval = dict()

-??????????? gateway = DEFAULT_GW_PATTERN.search(default_route_line)

-??????????? if gateway:

-??????????????? retval.update(gateway=gateway.group(1))

-??????????? metric = METRIC_PATTERN.search(default_route_line)

-??????????? if metric:

-??????????????? retval.update(metric=int(metric.group(1)))

-

-??????? return retval

+??? def get_gateway(self, scope=None, table=None,

+??????????????????? ip_version=constants.IP_VERSION_4):

+??????? routes = self.list_routes(ip_version, scope=scope, table=table)

+??????? for route in routes:

+??????????? if route['via'] and route['cidr'] in constants.IP_ANY.values():

+??????????????? return route

?

???? def flush(self, ip_version, table=None, **kwargs):

???????? args = ['flush']

@@ -696,16 +655,11 @@

???????????? args += [k, v]

???????? self._as_root([ip_version], tuple(args))

?

-??? def add_route(self, cidr, via=None, table=None, **kwargs):

-??????? ip_version = common_utils.get_ip_version(cidr)

-??????? args = ['replace', cidr]

-??????? if via:

-??????????? args += ['via', via]

-??????? args += self._dev_args()

-??????? args += self._table_args(table)

-??????? for k, v in kwargs.items():

-??????????? args += [k, v]

-??????? self._run_as_root_detect_device_not_found([ip_version], args)

+??? def add_route(self, cidr, via=None, table=None, metric=None, scope=None,

+????????????????? **kwargs):

+??????? table = table or self._table

+??????? add_ip_route(self._parent.namespace, cidr, device=self.name, via=via,

+???????????????????? table=table, metric=metric, scope=scope, **kwargs)

?

???? def delete_route(self, cidr, via=None, table=None, **kwargs):

???????? ip_version = common_utils.get_ip_version(cidr)

@@ -1455,3 +1409,53 @@

???????????????? retval[device['vxlan_link_index']]['name'])

?

???? return list(retval.values())

+

+def add_ip_route(namespace, cidr, device=None, via=None, table=None,

+???????????????? metric=None, scope=None, **kwargs):

+??? '''Add an IP route'''

+??? if table:

+??????? table = IP_RULE_TABLES.get(table, table)

+??? ip_version = common_utils.get_ip_version(cidr or via)

+??? privileged.add_ip_route(namespace, cidr, ip_version,

+??????????????????????????? device=device, via=via, table=table,

+??????? ????????????????????metric=metric, scope=scope, **kwargs)

+

+

+def list_ip_routes(namespace, ip_version, scope=None, via=None, table=None,

+?????????????????? device=None, **kwargs):

+??? '''List IP routes'''

+??? def get_device(index, devices):

+??????? for device in (d for d in devices if d['index'] == index):

+??????????? return get_attr(device, 'IFLA_IFNAME')

+

+??? table = table if table else 'main'

+??? table = IP_RULE_TABLES.get(table, table)

+??? routes = privileged.list_ip_routes(namespace, ip_version, device=device,

+?????????????????????????????????????? table=table, **kwargs)

+??? devices = privileged.get_link_devices(namespace)

+??? ret = []

+??? for route in routes:

+??????? cidr = get_attr(route, 'RTA_DST')

+??????? if cidr:

+??????????? cidr = '%s/%s' % (cidr, route['dst_len'])

+??????? else:

+??????????? cidr = constants.IP_ANY[ip_version]

+??????? table = int(get_attr(route, 'RTA_TABLE'))

+??????? value = {

+??????????? 'table': IP_RULE_TABLES_NAMES.get(table, table),

+??????????? 'source_prefix': get_attr(route, 'RTA_PREFSRC'),

+??????????? 'cidr': cidr,

+??????????? 'scope': IP_ADDRESS_SCOPE[int(route['scope'])],

+??????????? 'device': get_device(int(get_attr(route, 'RTA_OIF')), devices),

+??????????? 'via': get_attr(route, 'RTA_GATEWAY'),

+??????????? 'priority': get_attr(route, 'RTA_PRIORITY'),

+??????? }

+

+??????? ret.append(value)

+

+??? if scope:

+??????? ret = [route for route in ret if route['scope'] == scope]

+??? if via:

+??????? ret = [route for route in ret if route['via'] == via]

+

+??? return ret

diff -ruN neutron-bak/cmd/sanity/checks.py neutron-iproute/cmd/sanity/checks.py

--- neutron-bak/cmd/sanity/checks.py?????? 2022-02-23 11:33:16.934132708 +0800

+++ neutron-iproute/cmd/sanity/checks.py??????? 2022-02-23 15:20:10.562018672 +0800

@@ -36,6 +36,7 @@

?from neutron.common import utils as common_utils

?from neutron.plugins.ml2.drivers.openvswitch.agent.common \

???? import constants as ovs_const

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

?

?LOG = logging.getLogger(__name__)

?

@@ -230,8 +231,8 @@

?

?

?def dhcp_release6_supported():

-??? return runtime_checks.dhcp_release6_supported()

-

+#??? return runtime_checks.dhcp_release6_supported()

+    return priv_dhcp.dhcp_release6_supported()

?

?def bridge_firewalling_enabled():

? ???for proto in ('arp', 'ip', 'ip6'):

@@ -363,7 +364,8 @@

?

???????????? default_gw = gw_dev.route.get_gateway(ip_version=6)

???????????? if default_gw:

-??????????????? default_gw = default_gw['gateway']

+#??????????????? default_gw = default_gw['gateway']

+??????????????? default_gw = default_gw['via']

?

???? return expected_default_gw == default_gw

?

diff -ruN neutron-bak/privileged/agent/linux/ip_lib.py neutron-iproute/privileged/agent/linux/ip_lib.py

--- neutron-bak/privileged/agent/linux/ip_lib.py 2020-12-14 18:26:08.339307939 +0800

+++ neutron-iproute/privileged/agent/linux/ip_lib.py 2022-02-23 15:20:39.477439105 +0800

@@ -634,3 +634,50 @@

???????? if e.errno == errno.ENOENT:

???????????? raise NetworkNamespaceNotFound(netns_name=namespace)

???????? raise

+

+@privileged.default.entrypoint

+@lockutils.synchronized('privileged-ip-lib')

+def add_ip_route(namespace, cidr, ip_version, device=None, via=None,

+???????????????? table=None, metric=None, scope=None, **kwargs):

+??? '''Add an IP route'''

+??? try:

+??????? with get_iproute(namespace) as ip:

+??????????? family = _IP_VERSION_FAMILY_MAP[ip_version]

+??????????? if not scope:

+??????????????? scope = 'global' if via else 'link'

+??????????? scope = _get_scope_name(scope)

+??????????? if cidr:

+????? ??????????kwargs['dst'] = cidr

+??????????? if via:

+??????????????? kwargs['gateway'] = via

+??????????? if table:

+??????????????? kwargs['table'] = int(table)

+??????????? if device:

+??????????????? kwargs['oif'] = get_link_id(device, namespace)

+???? ???????if metric:

+??????????????? kwargs['priority'] = int(metric)

+??????????? ip.route('replace', family=family, scope=scope, proto='static',

+???????????????????? **kwargs)

+??? except OSError as e:

+??????? if e.errno == errno.ENOENT:

+??????????? raise NetworkNamespaceNotFound(netns_name=namespace)

+??????? raise

+

+

+@privileged.default.entrypoint

+@lockutils.synchronized('privileged-ip-lib')

+def list_ip_routes(namespace, ip_version, device=None, table=None, **kwargs):

+??? '''List IP routes'''

+?? ?try:

+??????? with get_iproute(namespace) as ip:

+??????????? family = _IP_VERSION_FAMILY_MAP[ip_version]

+??????????? if table:

+??????????????? kwargs['table'] = table

+??????????? if device:

+??????????????? kwargs['oif'] = get_link_id(device, namespace)

+??????????? return make_serializable(ip.route('show', family=family, **kwargs))

+??? except OSError as e:

+??????? if e.errno == errno.ENOENT:

+??????????? raise NetworkNamespaceNotFound(netns_name=namespace)

+??????? raise

+
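
With the A1.3 patch above, get_gateway() no longer parses `ip route` text. It returns a route dictionary whose next hop is stored under 'via', which is why callers switch from gateway['gateway'] to gateway['via']. The sketch below reproduces that selection step on hand-written sample data. find_default_gateway() and the sample routes are illustrative only, and the IP_ANY mapping is assumed to mirror neutron_lib.constants.IP_ANY.

```python
# Assumed to mirror neutron_lib.constants.IP_ANY.
IP_ANY = {4: '0.0.0.0/0', 6: '::/0'}


def find_default_gateway(routes):
    """Return the first default route that has a next hop, else None.

    routes: list of dicts shaped like the values built by list_ip_routes()
    in the patch ('cidr', 'via', 'scope', 'device', ...).
    """
    for route in routes:
        if route.get('via') and route.get('cidr') in IP_ANY.values():
            return route
    return None


if __name__ == '__main__':
    sample_routes = [
        {'cidr': '10.0.0.0/24', 'via': None,
         'scope': 'link', 'device': 'qg-demo'},
        {'cidr': '0.0.0.0/0', 'via': '10.0.0.1',
         'scope': 'global', 'device': 'qg-demo'},
    ]
    gateway = find_default_gateway(sample_routes)
    print(gateway['via'])
```

Any caller that still reads the old 'gateway' key raises KeyError on the new dictionaries, so router_info.py and the sanity checks have to be changed together with ip_lib.py.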

A1.4

diff -ruN neutron-bak/agent/linux/dhcp.py neutron-dhcprelease/agent/linux/dhcp.py

--- neutron-bak/agent/linux/dhcp.py 2020-12-15 09:59:29.966957908 +0800

+++ neutron-dhcprelease/agent/linux/dhcp.py? 2022-02-23 15:10:14.169101010 +0800

@@ -25,6 +25,7 @@

?from neutron_lib import constants

?from neutron_lib import exceptions

?from neutron_lib.utils import file as file_utils

+from oslo_concurrency import processutils

?from oslo_log import log as logging

?from oslo_utils import excutils

?from oslo_utils import fileutils

@@ -41,6 +42,7 @@

?from neutron.common import ipv6_utils

?from neutron.common import utils as common_utils

?from neutron.ipam import utils as ipam_utils

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

?

?LOG = logging.getLogger(__name__)

?

@@ -476,7 +478,8 @@

?

???? def _is_dhcp_release6_supported(self):

???????? if self._IS_DHCP_RELEASE6_SUPPORTED is None:

-??????????? self._IS_DHCP_RELEASE6_SUPPORTED = checks.dhcp_release6_supported()

+??????????? self._IS_DHCP_RELEASE6_SUPPORTED = (

+??????????????? priv_dhcp.dhcp_release6_supported())

???????????? if not self._IS_DHCP_RELEASE6_SUPPORTED:

???????????????? LOG.warning('dhcp_release6 is not present on this system, '

???????????????????????????? 'will not call it again.')

@@ -485,24 +488,28 @@

???? def _release_lease(self, mac_address, ip, ip_version, client_id=None,

??????????????????????? server_id=None, iaid=None):

???????? '''Release a DHCP lease.'''

-??????? if ip_version == constants.IP_VERSION_6:

-??????????? if not self._is_dhcp_release6_supported():

-??????????????? return

-??????????? cmd = ['dhcp_release6', '--iface', self.interface_name,

-?????????????????? '--ip', ip, '--client-id', client_id,

-?????????????????? '--server-id', server_id, '--iaid', iaid]

-??????? else:

-??????????? cmd = ['dhcp_release', self.interface_name, ip, mac_address]

-??????????? if client_id:

-??????????????? cmd.append(client_id)

-??????? ip_wrapper = ip_lib.IPWrapper(namespace=self.network.namespace)

???????? try:

-??????????? ip_wrapper.netns.execute(cmd, run_as_root=True)

-??????? except RuntimeError as e:

+??????????? if ip_version == constants.IP_VERSION_6:

+??????????????? if not self._is_dhcp_release6_supported():

+??????????????????? return

+

+??????????????? params = {'interface_name': self.interface_name,

+??? ??????????????????????'ip_address': ip, 'client_id': client_id,

+????????????????????????? 'server_id': server_id, 'iaid': iaid,

+????????????????????????? 'namespace': self.network.namespace}

+??????????????? priv_dhcp.dhcp_release6(**params)

+?????????? ?else:

+??????????????? params = {'interface_name': self.interface_name,

+????????????????????????? 'ip_address': ip, 'mac_address': mac_address,

+????????????????????????? 'client_id': client_id,

+????????????????????????? 'namespace': self.network.namespace}

+#??????????????? LOG.info('Rock_DEBUG: DHCP release construct params %(params)s.', {'params': params})

+??????????????? priv_dhcp.dhcp_release(**params)

+??????? except (processutils.ProcessExecutionError, OSError) as e:

???????????? # when failed to release single lease there's

???????????? # no need to propagate error further

-??????????? LOG.warning('DHCP release failed for %(cmd)s. '

-??????????????????????? 'Reason: %(e)s', {'cmd': cmd, 'e': e})

+??????????? LOG.warning('DHCP release failed for params %(params)s. '

+??????????????????????? 'Reason: %(e)s', {'params': params, 'e': e})

?

???? def _output_config_files(self):

???????? self._output_hosts_file()

diff -ruN neutron-bak/cmd/sanity/checks.py neutron-dhcprelease/cmd/sanity/checks.py

--- neutron-bak/cmd/sanity/checks.py?????? 2022-02-23 11:33:16.934132708 +0800

+++ neutron-dhcprelease/cmd/sanity/checks.py 2022-02-23 15:11:07.536446402 +0800

@@ -36,6 +36,7 @@

?from neutron.common import utils as common_utils

?from neutron.plugins.ml2.drivers.openvswitch.agent.common \

???? import constants as ovs_const

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

?

?LOG = logging.getLogger(__name__)

?

@@ -230,8 +231,8 @@

?

?

?def dhcp_release6_supported():

-??? return runtime_checks.dhcp_release6_supported()

-

+#??? return runtime_checks.dhcp_release6_supported()

+    return priv_dhcp.dhcp_release6_supported()

?

?def bridge_firewalling_enabled():

???? for proto in ('arp', 'ip', 'ip6'):

@@ -363,7 +364,8 @@

?

???????????? default_gw = gw_dev.route.get_gateway(ip_version=6)

???????????? if default_gw:

-??????????????? default_gw = default_gw['gateway']

+#??????????????? default_gw = default_gw['gateway']

+??????????????? default_gw = default_gw['via']

?

???? return expected_default_gw == default_gw

?

diff -ruN neutron-bak/privileged/__init__.py neutron-dhcprelease/privileged/__init__.py

--- neutron-bak/privileged/__init__.py??????? 2020-04-23 14:45:14.000000000 +0800

+++ neutron-dhcprelease/privileged/__init__.py 2022-02-23 15:10:29.209584186 +0800

@@ -27,3 +27,11 @@

?????????????????? caps.CAP_DAC_OVERRIDE,

?????????????????? caps.CAP_DAC_READ_SEARCH],

?)

+

+dhcp_release_cmd = priv_context.PrivContext(

+??? __name__,

+??? cfg_section='privsep_dhcp_release',

+??? pypath=__name__ + '.dhcp_release_cmd',

+??? capabilities=[caps.CAP_SYS_ADMIN,

+????????????????? caps.CAP_NET_ADMIN]

+)
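
The A1.4 patch above hands a params dictionary to priv_dhcp.dhcp_release() and priv_dhcp.dhcp_release6(), but the privileged module itself is not shown in this excerpt. As a rough sketch only, the helper below rebuilds the command lines that the removed code passed to ip_wrapper.netns.execute(). build_release_cmd() is a hypothetical name, and the assumption that the privileged functions run equivalent commands is not confirmed by the patch.

```python
def build_release_cmd(ip_version, interface_name, ip_address,
                      mac_address=None, client_id=None,
                      server_id=None, iaid=None):
    """Build the dhcp_release / dhcp_release6 argument list.

    Mirrors the command lists of the pre-patch _release_lease().
    """
    if ip_version == 6:
        return ['dhcp_release6', '--iface', interface_name,
                '--ip', ip_address, '--client-id', client_id,
                '--server-id', server_id, '--iaid', iaid]
    cmd = ['dhcp_release', interface_name, ip_address, mac_address]
    if client_id:
        # The IPv4 tool takes the client id as an optional last argument.
        cmd.append(client_id)
    return cmd


if __name__ == '__main__':
    print(build_release_cmd(4, 'tap-demo', '192.0.2.10',
                            mac_address='fa:16:3e:00:00:01'))
```

The namespace key in params has no counterpart here because the old code selected the namespace through IPWrapper rather than through the command arguments.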

A1.5

diff -ruN neutron-bak/agent/linux/ipset_manager.py neutron/agent/linux/ipset_manager.py

--- neutron-bak/agent/linux/ipset_manager.py? 2022-02-16 15:11:40.419016919 +0800

+++ neutron/agent/linux/ipset_manager.py??????? 2022-02-16 15:17:02.328133786 +0800

@@ -146,7 +146,7 @@

???????????? cmd_ns.extend(['ip', 'netns', 'exec', self.namespace])

???????? cmd_ns.extend(cmd)

???????? self.execute(cmd_ns, run_as_root=True, process_input=input,

-???? ????????????????check_exit_code=fail_on_errors)

+???????????????????? check_exit_code=fail_on_errors, privsep_exec=True)

?

???? def _get_new_set_ips(self, set_name, expected_ips):

???????? new_member_ips = (set(expected_ips) -

diff -ruN neutron-bak/agent/linux/iptables_manager.py neutron/agent/linux/iptables_manager.py

--- neutron-bak/agent/linux/iptables_manager.py????? 2022-02-16 15:05:53.853147520 +0800

+++ neutron/agent/linux/iptables_manager.py?? 2021-07-07 14:59:16.000000000 +0800

@@ -475,12 +475,15 @@

???? ????args = ['iptables-save', '-t', table]

???????? if self.namespace:

???????????? args = ['ip', 'netns', 'exec', self.namespace] + args

-??????? return self.execute(args, run_as_root=True).split('\n')

+??????? #return self.execute(args, run_as_root=True).split('\n')

+??????? return self.execute(args, run_as_root=True,

+??????????????????????????? privsep_exec=True).split('\n')

?

???? def _get_version(self):

???????? # Output example is 'iptables v1.6.2'

???????? args = ['iptables', '--version']

-??????? version = str(self.execute(args, run_as_root=True).split()[1][1:])

+??????? #version = str(self.execute(args, run_as_root=True).split()[1][1:])

+??????? version = str(self.execute(args, run_as_root=True, privsep_exec=True).split()[1][1:])

???????? LOG.debug('IPTables version installed: %s', version)

???????? return version

?

@@ -505,8 +508,10 @@

???????????? args += ['-w', self.xlock_wait_time, '-W', XLOCK_WAIT_INTERVAL]

???????? try:

???????????? kwargs = {} if lock else {'log_fail_as_error': False}

+?????? ?????#self.execute(args, process_input='\n'.join(commands),

+??????????? #???????????? run_as_root=True, **kwargs)

???????????? self.execute(args, process_input='\n'.join(commands),

-???????????????????????? run_as_root=True, **kwargs)

+?????????????????? ??????run_as_root=True, privsep_exec=True, **kwargs)

???????? except RuntimeError as error:

???????????? return error

?

@@ -568,7 +573,8 @@

???????????? if self.namespace:

???????????????? args = ['ip', 'netns', 'exec', self.namespace] + args

???????????? try:

-??????????????? save_output = self.execute(args, run_as_root=True)

+??????????????? #save_output = self.execute(args, run_as_root=True)

+??????????????? save_output = self.execute(args, run_as_root=True, privsep_exec=True)

???????????? except RuntimeError:

???????????????? # We could be racing with a cron job deleting namespaces.

???????????????? # It is useless to try to apply iptables rules over and

@@ -769,7 +775,8 @@

???????????????? args.append('-Z')

???????????? if self.namespace:

????????????? ???args = ['ip', 'netns', 'exec', self.namespace] + args

-??????????? current_table = self.execute(args, run_as_root=True)

+??????????? #current_table = self.execute(args, run_as_root=True)

+??????????? current_table = self.execute(args, run_as_root=True, privsep_exec=True)

???????????? current_lines = current_table.split('\n')

?

???????????? for line in current_lines[2:]:

diff -ruN neutron-bak/agent/linux/utils.py neutron/agent/linux/utils.py

--- neutron-bak/agent/linux/utils.py 2022-02-16 15:06:03.133090388 +0800

+++ neutron/agent/linux/utils.py?????? 2021-07-08 09:34:12.000000000 +0800

@@ -38,6 +38,7 @@

?from neutron.agent.linux import xenapi_root_helper

?from neutron.common import utils

?from neutron.conf.agent import common as config

+from neutron.privileged.agent.linux import utils as priv_utils

?from neutron import wsgi

?

?

@@ -85,13 +86,24 @@

???? if run_as_root:

???????? cmd = shlex.split(config.get_root_helper(cfg.CONF)) + cmd

???? LOG.debug('Running command: %s', cmd)

-??? obj = utils.subprocess_popen(cmd, shell=False,

-???????????????????????????????? stdin=subprocess.PIPE,

-???????????????????????????????? stdout=subprocess.PIPE,

-???????????????????????????????? stderr=subprocess.PIPE)

+??? #obj = utils.subprocess_popen(cmd, shell=False,

+??? #??????? ?????????????????????stdin=subprocess.PIPE,

+??? #???????????????????????????? stdout=subprocess.PIPE,

+??? #???????????????????????????? stderr=subprocess.PIPE)

+??? obj = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,

+??????????????????????? ???stdout=subprocess.PIPE, stderr=subprocess.PIPE)

?

???? return obj, cmd

?

+def _execute_process(cmd, _process_input, addl_env, run_as_root):

+??? obj, cmd = create_process(cmd, run_as_root=run_as_root, addl_env=addl_env)

+??? _stdout, _stderr = obj.communicate(_process_input)

+??? returncode = obj.returncode

+??? obj.stdin.close()

+??? _stdout = helpers.safe_decode_utf8(_stdout)

+??? _stderr = helpers.safe_decode_utf8(_stderr)

+??? return _stdout, _stderr, returncode

+

?

?def execute_rootwrap_daemon(cmd, process_input, addl_env):

???? cmd = list(map(str, addl_env_args(addl_env) + cmd))

@@ -103,31 +115,45 @@

???? LOG.debug('Running command (rootwrap daemon): %s', cmd)

???? client = RootwrapDaemonHelper.get_client()

???? try:

-??????? return client.execute(cmd, process_input)

+??????? #return client.execute(cmd, process_input)

+        returncode, _stdout, _stderr = client.execute(cmd, process_input)

???? except Exception:

???????? with excutils.save_and_reraise_exception():

???????????? LOG.error('Rootwrap error running command: %s', cmd)

+??? _stdout = helpers.safe_decode_utf8(_stdout)

+??? _stderr = helpers.safe_decode_utf8(_stderr)

+??? return _stdout, _stderr, returncode

?

?

?def execute(cmd, process_input=None, addl_env=None,

???????????? check_exit_code=True, return_stderr=False, log_fail_as_error=True,

-??????????? extra_ok_codes=None, run_as_root=False):

+??????????? extra_ok_codes=None, run_as_root=False, privsep_exec=False):

???? try:

???????? if process_input is not None:

???????????? _process_input = encodeutils.to_utf8(process_input)

???????? else:

???????????? _process_input = None

-??????? if run_as_root and cfg.CONF.AGENT.root_helper_daemon:

-??????????? returncode, _stdout, _stderr = (

-??????????????? execute_rootwrap_daemon(cmd, process_input, addl_env))

+??????? #if run_as_root and cfg.CONF.AGENT.root_helper_daemon:

+??????? #??? returncode, _stdout, _stderr = (

+??????? #??????? execute_rootwrap_daemon(cmd, process_input, addl_env))

+??????? #else:

+??????? #??? obj, cmd = create_process(cmd, run_as_root=run_as_root,

+??????? #????????????????????????????? addl_env=addl_env)

+??????? #??? _stdout, _stderr = obj.communicate(_process_input)

+??????? #??? returncode = obj.returncode

+??????? #??? obj.stdin.close()

+??????? #_stdout = helpers.safe_decode_utf8(_stdout)

+??????? #_stderr = helpers.safe_decode_utf8(_stderr)

+

+??????? if run_as_root and privsep_exec:

+??????????? _stdout, _stderr, returncode = priv_utils.execute_process(

+??????????????? cmd, _process_input, addl_env)

+??????? elif run_as_root and cfg.CONF.AGENT.root_helper_daemon:

+            _stdout, _stderr, returncode = execute_rootwrap_daemon(

+??????????????? cmd, process_input, addl_env)

???????? else:

-??????????? obj, cmd = create_process(cmd, run_as_root=run_as_root,

-?? ???????????????????????????????????addl_env=addl_env)

-??????????? _stdout, _stderr = obj.communicate(_process_input)

-??????????? returncode = obj.returncode

-??????????? obj.stdin.close()

-??????? _stdout = helpers.safe_decode_utf8(_stdout)

-??????? _stderr = helpers.safe_decode_utf8(_stderr)

+??????????? _stdout, _stderr, returncode = _execute_process(

+??????????????? cmd, _process_input, addl_env, run_as_root)

?

???????? extra_ok_codes = extra_ok_codes or []

???????? if returncode and returncode not in extra_ok_codes:
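The patched `execute()` above chooses between three execution paths. A minimal sketch of that precedence follows; `choose_exec_path` and the returned labels are illustrative names, not part of neutron, and the `root_helper_daemon` argument stands in for `cfg.CONF.AGENT.root_helper_daemon`.

```python
def choose_exec_path(run_as_root, privsep_exec, root_helper_daemon):
    """Mirror the branch order of the patched execute()."""
    if run_as_root and privsep_exec:
        return 'privsep'            # priv_utils.execute_process(...)
    if run_as_root and root_helper_daemon:
        return 'rootwrap-daemon'    # execute_rootwrap_daemon(...)
    return 'subprocess'             # _execute_process(...)


# Walk through a few flag combinations to show which path wins.
for args in [(True, True, True), (True, False, True),
             (True, False, False), (False, True, True)]:
    print(args, '->', choose_exec_path(*args))
```

In this sketch, as in the patch, `privsep_exec=True` has no effect unless `run_as_root` is also true.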

diff -ruN neutron-bak/cmd/ipset_cleanup.py neutron/cmd/ipset_cleanup.py

--- neutron-bak/cmd/ipset_cleanup.py????? 2022-02-16 15:18:00.727786180 +0800

+++ neutron/cmd/ipset_cleanup.py?? 2021-07-07 15:00:03.000000000 +0800

@@ -38,7 +38,8 @@

?def remove_iptables_reference(ipset):

???? # Remove any iptables reference to this IPset

???? cmd = ['iptables-save'] if 'IPv4' in ipset else ['ip6tables-save']

-??? iptables_save = utils.execute(cmd, run_as_root=True)

+??? #iptables_save = utils.execute(cmd, run_as_root=True)

+??? iptables_save = utils.execute(cmd, run_as_root=True, privsep_exec=True)

?

???? if ipset in iptables_save:

???????? cmd = ['iptables'] if 'IPv4' in ipset else ['ip6tables']

@@ -50,7 +51,8 @@

? ???????????????params = rule.split()

???????????????? params[0] = '-D'

???????????????? try:

-??????????????????? utils.execute(cmd + params, run_as_root=True)

+??????????????????? #utils.execute(cmd + params, run_as_root=True)

+??????????????????? utils.execute(cmd + params, run_as_root=True, privsep_exec=True)

???????????????? except Exception:

???????????????????? LOG.exception('Error, unable to remove iptables rule '

?????????????????????????????????? 'for IPset: %s', ipset)

@@ -65,7 +67,8 @@

???? LOG.info('Destroying IPset: %s', ipset)

???? cmd = ['ipset', 'destroy', ipset]

???? try:

-??????? utils.execute(cmd, run_as_root=True)

+??????? #utils.execute(cmd, run_as_root=True)

+??????? utils.execute(cmd, run_as_root=True, privsep_exec=True)

???? except Exception:

???????? LOG.exception('Error, unable to destroy IPset: %s', ipset)

?

@@ -75,7 +78,8 @@

???? LOG.info('Destroying IPsets with prefix: %s', conf.prefix)

?

???? cmd = ['ipset', '-L', '-n']

-??? ipsets = utils.execute(cmd, run_as_root=True)

+??? #ipsets = utils.execute(cmd, run_as_root=True)

+??? ipsets = utils.execute(cmd, run_as_root=True, privsep_exec=True)

???? for ipset in ipsets.split('\n'):

???????? if conf.allsets or ipset.startswith(conf.prefix):

???????????? destroy_ipset(conf, ipset)

diff -ruN neutron-bak/privileged/agent/linux/utils.py neutron/privileged/agent/linux/utils.py

--- neutron-bak/privileged/agent/linux/utils.py? 1970-01-01 08:00:00.000000000 +0800

+++ neutron/privileged/agent/linux/utils.py??????? 2021-07-07 14:58:21.000000000 +0800

@@ -0,0 +1,82 @@

+# Copyright 2020 Red Hat, Inc.

+#

+#??? Licensed under the Apache License, Version 2.0 (the 'License'); you may

+#??? not use this file except in compliance with the License. You may obtain

+#??? a copy of the License at

+#

+#???????? http://www.apache.org/licenses/LICENSE-2.0

+#

+#??? Unless required by applicable law or agreed to in writing, software

+#??? distributed under the License is distributed on an 'AS IS' BASIS, WITHOUT

+#??? WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the

+#??? License for the specific language governing permissions and limitations

+#??? under the License.

+

+import os

+import re

+

+from eventlet.green import subprocess

+from neutron_lib.utils import helpers

+from oslo_concurrency import processutils

+from oslo_utils import fileutils

+

+from neutron import privileged

+

+

+NETSTAT_PIDS_REGEX = re.compile(r'.* (?P<pid>\d{2,6})/.*')

+

+

+@privileged.default.entrypoint

+def find_listen_pids_namespace(namespace):

+??? return _find_listen_pids_namespace(namespace)

+

+

+def _find_listen_pids_namespace(namespace):

+??? '''Retrieve a list of pids of listening processes within the given netns

+??? This method is implemented separately to allow unit testing.

+??? '''

+??? pids = set()

+??? cmd = ['ip', 'netns', 'exec', namespace, 'netstat', '-nlp']

+??? output = processutils.execute(*cmd)

+??? for line in output[0].splitlines():

+??????? m = NETSTAT_PIDS_REGEX.match(line)

+??????? if m:

+??????????? pids.add(m.group('pid'))

+??? return list(pids)

+

+

+@privileged.default.entrypoint

+def delete_if_exists(path, remove=os.unlink):

+??? fileutils.delete_if_exists(path, remove=remove)

+

+

+@privileged.default.entrypoint

+def execute_process(cmd, _process_input, addl_env):

+??? obj, cmd = _create_process(cmd, addl_env=addl_env)

+??? _stdout, _stderr = obj.communicate(_process_input)

+??? returncode = obj.returncode

+??? obj.stdin.close()

+??? _stdout = helpers.safe_decode_utf8(_stdout)

+??? _stderr = helpers.safe_decode_utf8(_stderr)

+??? return _stdout, _stderr, returncode

+

+

+def _addl_env_args(addl_env):

+??? '''Build arguments for adding additional environment vars with env'''

+

+??? # NOTE (twilson) If using rootwrap, an EnvFilter should be set up for the

+??? # command instead of a CommandFilter.

+??? if addl_env is None:

+??????? return []

+??? return ['env'] + ['%s=%s' % pair for pair in addl_env.items()]

+

+

+def _create_process(cmd, addl_env=None):

+??? '''Create a process object for the given command.

+??? The return value will be a tuple of the process object and the

+??? list of command arguments used to create it.

+??? '''

+??? cmd = list(map(str, _addl_env_args(addl_env) + list(cmd)))

+??? obj = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,

+?????????????????????????? stdout=subprocess.PIPE, stderr=subprocess.PIPE)

+??? return obj, cmd
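The `_find_listen_pids_namespace` helper in the patch above reads PIDs out of `netstat -nlp` output through a regex with a named `pid` group (the patch calls `m.group('pid')`). As a minimal sketch of that parsing step only, the snippet below applies such a pattern to hard-coded text; the sample output and the `parse_listen_pids` name are assumptions made for illustration.

```python
import re

# Capture a 2-6 digit PID that is preceded by a space and followed by "/program".
NETSTAT_PIDS_REGEX = re.compile(r'.* (?P<pid>\d{2,6})/.*')


def parse_listen_pids(netstat_output):
    """Return the sorted, de-duplicated PIDs found in `netstat -nlp` style text."""
    pids = set()
    for line in netstat_output.splitlines():
        m = NETSTAT_PIDS_REGEX.match(line)
        if m:
            pids.add(m.group('pid'))
    return sorted(pids)


# Hypothetical sample output, made up for illustration.
SAMPLE = (
    "Proto Recv-Q Send-Q Local Address   Foreign Address State  PID/Program name\n"
    "tcp        0      0 0.0.0.0:80      0.0.0.0:*       LISTEN 1234/haproxy\n"
    "tcp        0      0 0.0.0.0:53      0.0.0.0:*       LISTEN 5678/dnsmasq\n"
    "udp        0      0 0.0.0.0:53      0.0.0.0:*              5678/dnsmasq\n"
)

print(parse_listen_pids(SAMPLE))
```

The PIDs are kept as strings, as in the patch, and the header line is skipped because nothing numeric precedes its slash.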

B1.1

--- neutron-bak/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py? 2022-08-02 17:02:51.213224245 +0800

+++ neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py 2022-08-02 17:02:09.181883012 +0800

@@ -161,8 +161,8 @@

???????? self.enable_distributed_routing = agent_conf.enable_distributed_routing

???????? self.arp_responder_enabled = agent_conf.arp_responder and self.l2_pop

?

-??????? host = self.conf.host

-??????? self.agent_id = 'ovs-agent-%s' % host

+??????? self.host = self.conf.host

+??????? self.agent_id = 'ovs-agent-%s' % self.host

?

???????? self.enable_tunneling = bool(self.tunnel_types)

?

@@ -245,7 +245,7 @@

???????????? self.phys_ofports,

???????????? self.patch_int_ofport,

???????????? self.patch_tun_ofport,

-??????????? host,

+??????????? self.host,

???????????? self.enable_tunneling,

???????????? self.enable_distributed_routing,

???????????? self.arp_responder_enabled)

@@ -289,7 +289,7 @@

???????? #?????????????? ???or which are used by specific extensions.

???????? self.agent_state = {

???????????? 'binary': 'neutron-openvswitch-agent',

-??????????? 'host': host,

+??????????? 'host': self.host,

???????????? 'topic': n_const.L2_AGENT_TOPIC,

???????????? 'configurations': {'bridge_mappings': self.bridge_mappings,

??????????????????????????????? c_const.RP_BANDWIDTHS: self.rp_bandwidths,

@@ -1671,6 +1671,7 @@

???????? skipped_devices = []

???????? need_binding_devices = []

???????? binding_no_activated_devices = set()

+??????? migrating_devices = set()

???????? agent_restarted = self.iter_num == 0

???????? devices_details_list = (

???????????? self.plugin_rpc.get_devices_details_list_and_failed_devices(

@@ -1696,6 +1697,12 @@

???????????????? skipped_devices.append(device)

???????????????? continue

?

+??????????? migrating_to = details.get('migrating_to')

+??????????? if migrating_to and migrating_to != self.host:

+??????????????? LOG.info('Port %(device)s is being migrated to host %(host)s.',

+???????????????????????? {'device': device, 'host': migrating_to})

+??????????????? migrating_devices.add(device)

+

???????????? if 'port_id' in details:

???????????????? LOG.info('Port %(device)s updated. Details: %(details)s',

????????????????????????? {'device': device, 'details': details})

@@ -1729,7 +1736,7 @@

???????????????? if (port and port.ofport != -1):

???????????????????? self.port_dead(port)

???????? return (skipped_devices, binding_no_activated_devices,

-??????????????? need_binding_devices, failed_devices)

+??????? ????????need_binding_devices, failed_devices, migrating_devices)

?

???? def _update_port_network(self, port_id, network_id):

???????? self._clean_network_ports(port_id)

@@ -1821,10 +1828,12 @@

???????? need_binding_devices = []

???????? skipped_devices = set()

???????? binding_no_activated_devices = set()

+??????? migrating_devices = set()

???????? start = time.time()

???????? if devices_added_updated:

???????????? (skipped_devices, binding_no_activated_devices,

-???????????? need_binding_devices, failed_devices['added']) = (

+???????????? need_binding_devices, failed_devices['added'],

+??????????????? migrating_devices) = (

???????????????? self.treat_devices_added_or_updated(

???????????????????? devices_added_updated, provisioning_needed))

???????????? LOG.debug('process_network_ports - iteration:%(iter_num)d - '

@@ -1847,7 +1856,7 @@

???????? # TODO(salv-orlando): Optimize avoiding applying filters

???????? # unnecessarily, (eg: when there are no IP address changes)

???????? added_ports = (port_info.get('added', set()) - skipped_devices -

-?????????????????????? binding_no_activated_devices)

+?????????????????????? binding_no_activated_devices - migrating_devices)

???????? self._add_port_tag_info(need_binding_devices)

???????? self.sg_agent.setup_port_filters(added_ports,

????????????????????????????????????????? port_info.get('updated', set()))
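The last hunk above removes migrating ports from the set that receives security-group filters. A small sketch of that set arithmetic, using plain Python sets and a hypothetical `ports_to_filter` helper (the port names are made up):

```python
def ports_to_filter(added, skipped, binding_not_activated, migrating):
    """Return the added ports that should still get security-group filters."""
    return (set(added) - set(skipped) - set(binding_not_activated)
            - set(migrating))


added = {'port-a', 'port-b', 'port-c', 'port-d'}
result = ports_to_filter(added,
                         skipped={'port-b'},
                         binding_not_activated={'port-c'},
                         migrating={'port-d'})
print(result)
```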

B1.2

--- neutron-bak/conf/common.py???? 2022-08-02 17:07:18.239265163 +0800

+++ neutron/conf/common.py 2021-09-08 17:08:59.000000000 +0800

@@ -166,6 +166,24 @@

??????????????? help=_('Type of the nova endpoint to use.? This endpoint will'

?????????????????????? ' be looked up in the keystone catalog and should be'

?????????????????????? ' one of public, internal or admin.')),

+    cfg.BoolOpt('live_migration_events', default=True,
+                help=_('When this option is enabled, during the live '
+                       'migration, the OVS agent will only send the '
+                       '"vif-plugged-event" when the destination host '
+                       'interface is bound. This option also disables any '
+                       'other agent (like DHCP) to send to Nova this event '
+                       'when the port is provisioned. '
+                       'This option can be enabled if Nova patch '
+                       'https://review.opendev.org/c/openstack/nova/+/767368 '
+                       'is in place. '
+                       'This option is temporary and will be removed in Y and '
+                       'the behavior will be "True".'),
+                deprecated_for_removal=True,
+                deprecated_reason=(
+                    'In Y the Nova patch '
+                    'https://review.opendev.org/c/openstack/nova/+/767368 '
+                    'will be in the code even when running a Nova server in '
+                    'X.')),

?]

B1.3

--- neutron-bak/agent/rpc.py?? 2021-08-25 15:29:11.000000000 +0800

+++ neutron/agent/rpc.py 2021-09-15 16:34:09.000000000 +0800

@@ -25,8 +25,10 @@

?from neutron_lib import constants

?from neutron_lib.plugins import utils

?from neutron_lib import rpc as lib_rpc

+from oslo_config import cfg

?from oslo_log import log as logging

?import oslo_messaging

+from oslo_serialization import jsonutils

?from oslo_utils import uuidutils

?

?from neutron.agent import resource_cache

@@ -323,8 +325,10 @@

???????? binding = utils.get_port_binding_by_status_and_host(

???????????? port_obj.bindings, constants.ACTIVE, raise_if_not_found=True,

???????????? port_id=port_obj.id)

-??????? if (port_obj.device_owner.startswith(

-???????????? ???constants.DEVICE_OWNER_COMPUTE_PREFIX) and

+??????? migrating_to = migrating_to_host(port_obj.bindings)

+??????? if (not (migrating_to and cfg.CONF.nova.live_migration_events) and

+??????????????? port_obj.device_owner.startswith(

+??????????????????? constants.DEVICE_OWNER_COMPUTE_PREFIX) and

???????????????? binding[pb_ext.HOST] != host):

???????????? LOG.debug('Device %s has no active binding in this host',

?????????????????????? port_obj)

@@ -357,7 +361,8 @@

???????????? 'qos_policy_id': port_obj.qos_policy_id,

???????????? 'network_qos_policy_id': net_qos_policy_id,

???????????? 'profile': binding.profile,

-??????????? 'security_groups': list(port_obj.security_group_ids)

+??????????? 'security_groups': list(port_obj.security_group_ids),

+??????????? 'migrating_to': migrating_to,

???????? }

???????? LOG.debug('Returning: %s', entry)

???????? return entry

@@ -365,3 +370,40 @@

???? def get_devices_details_list(self, context, devices, agent_id, host=None):

???????? return [self.get_device_details(context, device, agent_id, host)

???????????????? for device in devices]

+

+# TODO(ralonsoh): move this method to neutron_lib.plugins.utils

+def migrating_to_host(bindings, host=None):

+??? '''Return the host the port is being migrated.

+

+??? If the host is passed, the port binding profile with the 'migrating_to',

+??? that contains the host the port is being migrated, is compared to this

+??? value. If no value is passed, this method will return if the port is

+??? being migrated ('migrating_to' is present in any port binding profile).

+

+??? The function returns None or the matching host.

+??? '''

+??? #LOG.info('LiveDebug: enter migrating_to_host? 001')

+??? for binding in (binding for binding in bindings if

+??????????????????? binding[pb_ext.STATUS] == constants.ACTIVE):

+??????? profile = binding.get('profile')

+??????? if not profile:

+??????????? continue

+        # Disabled alternative, replaced by the explicit
+        # branches below:
+        # profile = (jsonutils.loads(profile) if isinstance(profile, str) else
+        #            profile)
+        # migrating_to = profile.get('migrating_to')

+??????? # add by michael

+??????? if isinstance(profile, str):

+??????????? migrating_to = jsonutils.loads(profile).get('migrating_to')

+??????????? #LOG.info('LiveDebug: migrating_to_host 001? migrating_to: %s', migrating_to)

+??????? else:

+??????????? migrating_to = profile.get('migrating_to')

+??????????? #LOG.info('LiveDebug: migrating_to_host 002? migrating_to: %s', migrating_to)

+

+??????? if migrating_to:

+??????????? if not host:? # Just know if the port is being migrated.

+?????? ?????????return migrating_to

+??????????? if migrating_to == host:

+??????????????? return migrating_to

+??? return None
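To make the behaviour of `migrating_to_host` easier to follow, here is a self-contained sketch that uses plain dictionaries instead of Neutron binding objects. The `STATUS` and `ACTIVE` literals are stand-ins for `pb_ext.STATUS` and `constants.ACTIVE`; their values, and the host names, are assumptions chosen so the sketch runs without neutron_lib.

```python
import json

# Stand-ins for pb_ext.STATUS and constants.ACTIVE (assumed values).
STATUS = 'status'
ACTIVE = 'ACTIVE'


def migrating_to_host(bindings, host=None):
    """Return the migration target host of an active binding, or None."""
    for binding in bindings:
        if binding.get(STATUS) != ACTIVE:
            continue
        profile = binding.get('profile')
        if not profile:
            continue
        # The profile may arrive as a JSON string or as a dict.
        if isinstance(profile, str):
            profile = json.loads(profile)
        migrating_to = profile.get('migrating_to')
        # Without a host filter, any target counts; otherwise it must match.
        if migrating_to and (not host or migrating_to == host):
            return migrating_to
    return None


bindings = [
    {'status': 'INACTIVE', 'profile': {'migrating_to': 'node-9'}},
    {'status': 'ACTIVE', 'profile': '{"migrating_to": "node-2"}'},
]
print(migrating_to_host(bindings))
print(migrating_to_host(bindings, host='node-3'))
```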

B1.4

--- neutron-bak/db/provisioning_blocks.py??????? 2021-08-25 15:43:47.000000000 +0800

+++ neutron/db/provisioning_blocks.py???? 2021-09-03 09:32:41.000000000 +0800

@@ -137,8 +137,7 @@

???????????? context, standard_attr_id=standard_attr_id):

???????? LOG.debug('Provisioning complete for %(otype)s %(oid)s triggered by '

?????????????????? 'entity %(entity)s.', log_dict)

-??????? registry.notify(object_type, PROVISIONING_COMPLETE,

-??????????????????????? 'neutron.db.provisioning_blocks',

+??????? registry.notify(object_type, PROVISIONING_COMPLETE, entity,

???????????????????????? context=context, object_id=object_id)

?

B1.5

--- neutron-bak/notifiers/nova.py???? 2021-08-25 16:02:33.000000000 +0800

+++ neutron/notifiers/nova.py? 2021-09-03 09:32:41.000000000 +0800

@@ -13,6 +13,8 @@

?#??? License for the specific language governing permissions and limitations

?#??? under the License.

?

+import contextlib

+

?from keystoneauth1 import loading as ks_loading

?from neutron_lib.callbacks import events

?from neutron_lib.callbacks import registry

@@ -66,6 +68,16 @@

???????????? if ext.name == 'server_external_events']

???????? self.batch_notifier = batch_notifier.BatchNotifier(

???????????? cfg.CONF.send_events_interval, self.send_events)

+??????? self._enabled = True

+

+??? @contextlib.contextmanager

+??? def context_enabled(self, enabled):

+??????? stored_enabled = self._enabled

+??????? try:

+??????????? self._enabled = enabled

+??????????? yield

+??????? finally:

+??????????? self._enabled = stored_enabled

?

???? def _get_nova_client(self):

???????? global_id = common_context.generate_request_id()

@@ -163,6 +175,10 @@

???????????????? return self._get_network_changed_event(port)

?

???? def _can_notify(self, port):

+??????? if not self._enabled:

+??????????? LOG.debug('Nova notifier disabled')

+??????????? return False

+

???????? if not port.id:

???????????? LOG.warning('Port ID not set! Nova will not be notified of '

???????????????????????? 'port status change.')
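The `context_enabled` context manager added above switches notifications off for the duration of a block and restores the previous state afterwards. The toy class below sketches that pattern in isolation; `Notifier` and its `notify` method are hypothetical stand-ins, not the real Nova notifier.

```python
import contextlib


class Notifier:
    """Toy stand-in for the Nova notifier; only the enable flag is modelled."""

    def __init__(self):
        self._enabled = True
        self.sent = []

    @contextlib.contextmanager
    def context_enabled(self, enabled):
        # Remember the previous state and restore it even if the body raises.
        stored_enabled = self._enabled
        try:
            self._enabled = enabled
            yield
        finally:
            self._enabled = stored_enabled

    def notify(self, event):
        # Plays the role of the _can_notify() guard in the patch.
        if not self._enabled:
            return False
        self.sent.append(event)
        return True


notifier = Notifier()
with notifier.context_enabled(False):
    suppressed = notifier.notify('network-vif-plugged')
delivered = notifier.notify('network-vif-plugged')
print(suppressed, delivered, notifier.sent)
```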

B1.6

--- nova-bak/compute/manager.py? 2022-08-02 16:27:45.943428128 +0800

+++ nova/compute/manager.py??????? 2021-09-03 09:35:24.529858458 +0800

@@ -6637,12 +6637,12 @@

???????? LOG.error(msg, msg_args)

?

???? @staticmethod

-??? def _get_neutron_events_for_live_migration(instance):

+??? def _get_neutron_events_for_live_migration(instance, migration):

???????? # We don't generate events if CONF.vif_plugging_timeout=0

???????? # meaning that the operator disabled using them.

-??????? if CONF.vif_plugging_timeout and utils.is_neutron():

-??????????? return [('network-vif-plugged', vif['id'])

-??????????????????? for vif in instance.get_network_info()]

+??????? if CONF.vif_plugging_timeout:

+??????????? return (instance.get_network_info()

+??????????????????? .get_live_migration_plug_time_events())

???????? else:

???????????? return []

?

@@ -6695,7 +6695,8 @@

???????????? '''

???????????? pass

?

-??????? events = self._get_neutron_events_for_live_migration(instance)

+??????? events = self._get_neutron_events_for_live_migration(

+??????????? instance, migration)

???????? try:

???????????? if ('block_migration' in migrate_data and

???????????????????? migrate_data.block_migration):

B1.7

--- nova-bak/network/model.py??????? 2022-08-02 16:27:47.490437859 +0800

+++ nova/network/model.py???? 2021-09-03 09:35:24.532858440 +0800

@@ -469,6 +469,14 @@

???????? return (self.is_hybrid_plug_enabled() and not

???????????????? migration.is_same_host())

?

+??? @property

+??? def has_live_migration_plug_time_event(self):

+??????? '''Returns whether this VIF's network-vif-plugged external event will

+??????? be sent by Neutron at 'plugtime' - in other words, as soon as neutron

+??????? completes configuring the network backend.

+??????? '''

+????? ??return self.is_hybrid_plug_enabled()

+

???? def is_hybrid_plug_enabled(self):

???????? return self['details'].get(VIF_DETAILS_OVS_HYBRID_PLUG, False)

?

@@ -527,20 +535,26 @@

???????? return jsonutils.dumps(self)

?

???? def get_bind_time_events(self, migration):

-??????? '''Returns whether any of our VIFs have 'bind-time' events. See

-??????? has_bind_time_event() docstring for more details.

+??????? '''Returns a list of external events for any VIFs that have

+??????? 'bind-time' events during cold migration.

???????? '''

???????? return [('network-vif-plugged', vif['id'])

???????????????? for vif in self if vif.has_bind_time_event(migration)]

?

+??? def get_live_migration_plug_time_events(self):

+??????? '''Returns a list of external events for any VIFs that have

+??????? 'plug-time' events during live migration.

+??????? '''

+??????? return [('network-vif-plugged', vif['id'])

+??????????????? for vif in self if vif.has_live_migration_plug_time_event]

+

???? def get_plug_time_events(self, migration):

-???? ???'''Complementary to get_bind_time_events(), any event that does not

-??????? fall in that category is a plug-time event.

+??????? '''Returns a list of external events for any VIFs that have

+??????? 'plug-time' events during cold migration.

???????? '''

???????? return [('network-vif-plugged', vif['id'])

???????????????? for vif in self if not vif.has_bind_time_event(migration)]

?

-

?class NetworkInfoAsyncWrapper(NetworkInfo):

???? '''Wrapper around NetworkInfo that allows retrieving NetworkInfo

???? in an async manner.
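The new `get_live_migration_plug_time_events` method above returns one `network-vif-plugged` event per VIF that has hybrid plugging enabled. A rough sketch with plain dictionaries follows; the `'ovs_hybrid_plug'` key is assumed to be the value of `VIF_DETAILS_OVS_HYBRID_PLUG`, and the VIF IDs are invented.

```python
# Assumed value of nova's VIF_DETAILS_OVS_HYBRID_PLUG constant.
HYBRID_PLUG_KEY = 'ovs_hybrid_plug'


def live_migration_plug_time_events(vifs):
    """List the external events to wait for during live migration."""
    return [('network-vif-plugged', vif['id'])
            for vif in vifs
            if vif.get('details', {}).get(HYBRID_PLUG_KEY, False)]


vifs = [
    {'id': 'vif-1', 'details': {HYBRID_PLUG_KEY: True}},
    {'id': 'vif-2', 'details': {HYBRID_PLUG_KEY: False}},
    {'id': 'vif-3', 'details': {}},
]
events = live_migration_plug_time_events(vifs)
print(events)
```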

C1.1

--- linux-3.10.0-1062.18.1.el7.orig/net/ipv4/udp_offload.c???? 2020-02-12 21:45:22.000000000 +0800

+++ linux-3.10.0-1062.18.1.el7/net/ipv4/udp_offload.c?? 2022-08-17 15:56:27.540557289 +0800

@@ -261,7 +261,7 @@ struct sk_buff **udp_gro_receive(struct

????? struct sock *sk;

?

????? if (NAPI_GRO_CB(skb)->encap_mark ||

-???? ??? (skb->ip_summed != CHECKSUM_PARTIAL &&

+???? ??? (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&

????? ???? NAPI_GRO_CB(skb)->csum_cnt == 0 &&

????? ???? !NAPI_GRO_CB(skb)->csum_valid))

???????????? goto out;

End of document