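Reinforcement Learning with PyTorch and Unity ML-Agents (파이토치와 유니티 ML-Agents로 배우는 강화학습)
- Authors: 민규식 (Min Kyu-sik), 이현호 (Lee Hyun-ho), 김영록 (Kim Young-rok), 정유정 (Jung Yoo-jung), 정규열 (Jung Kyu-yeol), 박유민 (Park Yoo-min), co-authors
- Publisher: 위키북스 (Wikibooks)
- Publication date: 2022-08-17
- Registration date: 2023-01-17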
Holdings 1, on loan 0, reservations 0, cumulative loans 9, cumulative reservations 4
Book description
Build your own game with Unity and set up a reinforcement learning environment with ML-Agents!
Unity ML-Agents is a helpful tool that turns simulation environments built with the Unity game engine into environments for reinforcement learning. Because ML-Agents lets many developers and researchers build the reinforcement learning environments they need themselves, it has become an important tool for applying reinforcement learning in both academia and industry. Even so, reference material on ML-Agents remains scarce, especially for versions from ML-Agents 2.0 onward, which has made the toolkit difficult to use. This book covers the range of topics needed to work with Unity ML-Agents, including Unity, ML-Agents, and deep reinforcement learning. It is a revised edition of "Reinforcement Learning with TensorFlow and Unity ML-Agents" (『텐서플로와 유니티 ML-Agents로 배우는 강화학습』), published in 2020, and covers the latest version of ML-Agents.
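About the author
Earned a Ph.D. from the Department of Future Automotive Engineering (미래자동차공학과) at Hanyang University and currently works as an AI engineer at Kakao. Serves on the organizing team of Reinforcement Learning Korea, a Facebook group dedicated to reinforcement learning, and was a member of the 3rd through 5th cohorts of Unity Masters, a group of Unity experts certified by Unity Korea.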
Table of contents
▣ Chapter 1: Overview of Reinforcement Learning
1.1 What is reinforcement learning?
    1.1.1 What is machine learning?
    1.1.2 Achievements of reinforcement learning
1.2 Basic terminology of reinforcement learning
1.3 Basic theory of reinforcement learning
    1.3.1 The Bellman equation
    1.3.2 Exploration and exploitation
▣ Chapter 2: A Look at Unity ML-Agents
2.1 Unity and ML-Agents
    2.1.1 Unity
    2.1.2 ML-Agents
2.2 Installing Unity and basic operation
    2.2.1 Downloading and installing Unity Hub
    2.2.2 Activating a Unity license
    2.2.3 Installing the Unity Editor
    2.2.4 Creating a Unity project
    2.2.5 The Unity interface
    2.2.6 Basic operation of Unity
2.3 Installing ML-Agents
    2.3.1 Downloading the ML-Agents files
    2.3.2 Installing ML-Agents in Unity
    2.3.3 Installing the ML-Agents Python package
2.4 Components of ML-Agents
    2.4.1 Behavior Parameters
    2.4.2 Agent Script
    2.4.3 Decision Requester, Model Overrider
    2.4.4 Building the environment
2.5 Using ML-Agents with mlagents-learn
    2.5.1 Reinforcement learning algorithms provided by ML-Agents
    2.5.2 Training methods provided by ML-Agents
    2.5.3 Training the 3DBall environment with the PPO algorithm
2.6 Using ML-Agents with the Python-API
    2.6.1 Random agent control through the Python-API
▣ Chapter 3: Building the GridWorld Environment
3.1 Starting the project
3.2 GridWorld script walkthrough
3.3 Adding vector observations and building the environment
3.4 Extra: Optimizing the code
▣ Chapter 4: Deep Q Network (DQN)
4.1 Background of the DQN algorithm
    4.1.1 Value-based reinforcement learning
    4.1.2 Overview of the DQN algorithm
4.2 Techniques of the DQN algorithm
    4.2.1 Experience replay
    4.2.2 Target network
4.3 DQN training
4.4 DQN code
    4.4.1 Importing libraries and setting parameter values
    4.4.2 Model class
    4.4.3 Agent class
    4.4.4 Main function
    4.4.5 Training results
▣ Chapter 5: Advantage Actor Critic (A2C)
5.1 Overview of the A2C algorithm
5.2 Structure of the actor-critic network
5.3 Training process of the A2C algorithm
5.4 The overall A2C training process
5.5 A2C code
    5.5.1 Importing libraries and setting parameter values
    5.5.2 Model class
    5.5.3 Agent class
    5.5.4 Main function
    5.5.5 Training results
▣ Chapter 6: Building the Drone Environment
6.1 Starting the project
6.2 Importing the drone asset & adding objects
    6.2.1 Downloading the drone asset from the Asset Store
    6.2.2 Creating the drone environment
6.3 Script walkthrough
    6.3.1 DroneSetting script
    6.3.2 DroneAgent script
6.4 Running and building the drone environment
▣ Chapter 7: Deep Deterministic Policy Gradient (DDPG)
7.1 Overview of the DDPG algorithm
7.2 Techniques of the DDPG algorithm
    7.2.1 Experience replay
    7.2.2 Target network
    7.2.3 Soft target update
    7.2.4 OU noise (Ornstein-Uhlenbeck noise)
7.3 DDPG training
    7.3.1 Updating the critic network
    7.3.2 Updating the actor network
7.4 DDPG code
    7.4.1 Importing libraries and setting parameter values
    7.4.2 OU Noise class
    7.4.3 Actor class
    7.4.4 Critic class
    7.4.5 Agent class
    7.4.6 Main function
    7.4.7 Training results
▣ Chapter 8: Building the Kart Racing Environment
8.1 Starting the project
8.2 Setting up the kart racing environment
8.3 Writing the scripts and building
▣ Chapter 9: Behavioral Cloning (BC)
9.1 Overview of the Behavioral Cloning algorithm
9.2 Techniques of the Behavioral Cloning algorithm
    9.2.1 Excluding data with negative rewards
9.3 Behavioral Cloning training
9.4 Behavioral Cloning algorithm code
    9.4.1 Importing libraries and setting parameter values
    9.4.2 Model class
    9.4.3 Agent class
    9.4.4 Main function
    9.4.5 Training results
9.5 Using the built-in Imitation Learning of ML-Agents
    9.5.1 The Behavioral Cloning algorithm provided by ML-Agents
    9.5.2 The GAIL algorithm provided by ML-Agents
    9.5.3 Config file settings for imitation learning
    9.5.4 Imitation learning results in ML-Agents
▣ Chapter 10: Wrap-up
10.1 Summary of the fundamentals volume
10.2 Further learning resources
    10.2.1 Unity
    10.2.2 Unity ML-Agents
    10.2.3 Reinforcement learning
10.3 Topics covered in the applications volume