
Applying deep learning to word translation without parallel data


VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY (ĐHQG-HCM)
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY (Trường Đại học Bách Khoa)
---------------------------------------
TRẦN QUÂN

APPLYING DEEP LEARNING TO WORD TRANSLATION WITHOUT PARALLEL DATA

Major: Computer Science
Major code: 8.48.01.01

MASTER'S THESIS

HO CHI MINH CITY, 2021

THIS WORK WAS COMPLETED AT: Trường Đại học Bách Khoa - ĐHQG-HCM
Scientific supervisor: Assoc. Prof. Dr. Quản Thành Thơ
Examiner 1: Dr. Võ Thị Ngọc Châu
Examiner 2: Assoc. Prof. Dr. Nguyễn Tuấn Đăng
The master's thesis was defended online at Trường Đại học Bách Khoa, ĐHQG Tp. HCM, on the 7th (the month and year are not legible in the source).
The thesis evaluation committee consisted of:
1. Dr. Nguyễn Đức Dũng - Chair
2. Dr. Nguyễn Tiến Thịnh - Secretary
3. Dr. Võ Thị Ngọc Châu - Reviewer 1
4. Assoc. Prof. Dr. Nguyễn Tuấn Đăng - Reviewer 2
5. Assoc. Prof. Dr. Huỳnh Trung Hiếu - Member
Confirmation by the Chair of the evaluation committee and the Dean of the faculty managing the major, after the thesis has been revised (if applicable).
COMMITTEE CHAIR
DEAN OF THE FACULTY OF COMPUTER SCIENCE AND ENGINEERING

VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY, HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY
SOCIALIST REPUBLIC OF VIETNAM
Independence - Freedom - Happiness

MASTER'S THESIS ASSIGNMENT
Student's full name: Trần Quân
Student ID (MSHV): 1870575
Date of birth: 7/05/1990
Place of birth: Quảng Ngãi
Major: Computer Science
Major code: 8.48.01.01
I. THESIS TITLE: Ứng dụng học sâu trong dịch từ vựng mà không cần dữ liệu song ngữ / Applying deep learning to word translation without parallel data.
II. TASKS AND CONTENT: Build a bilingual dictionary from monolingual document collections that are independent of each other, using a Generative Adversarial Network (GAN) deep learning model combined with solving the Procrustes problem.
III. ASSIGNMENT DATE: 21/03/2021
IV. COMPLETION DATE: 31/05/2021
V. SUPERVISOR: Assoc. Prof. Dr. Quản Thành Thơ
Ho Chi Minh City, 31/05
SUPERVISOR (full name and signature)
HEAD OF THE TRAINING DEPARTMENT (full name and signature)
DEAN OF THE FACULTY OF COMPUTER SCIENCE AND ENGINEERING (full name and signature)

ACKNOWLEDGEMENTS
First of all, I would like to express my deep gratitude to Assoc. Prof. Dr. Quản Thành Thơ, who guided me throughout the work on this thesis as well as on the thesis proposal. It is thanks to his guidance and feedback that I was able to complete this thesis.
I would like to thank the lecturers of the Faculty of Computer Science and Engineering, who have passed on valuable knowledge and experience to me over the past two years and more. My gratitude also goes to all members of Dr. Thơ's Language Model group for their help and support throughout the work on this thesis. Finally, I sincerely thank my family and friends, who have always encouraged and supported me during my graduate studies.
Ho Chi Minh City, 31/05

THESIS SUMMARY
Current solutions for automatically building bilingual dictionaries usually rely on bilingual datasets for training. Some recent studies show that a bilingual corpus may not be needed: a deep learning model can be trained to produce a set of parameters that maps the space of the source language to the space of the target language fully automatically. Put differently, the model tries to align the distribution of the source-language space with the distribution of the target-language space, and the model parameters become the mapping matrix between the two languages. This thesis takes the approach of using a Generative Adversarial Network (GAN) combined with solving the orthogonal Procrustes problem to build such a model. The data used in the thesis are monolingual corpora of English, French and Vietnamese from Wikipedia. The word embeddings used for training are Word2Vec and FastText. The dictionaries generated by the model are then evaluated from several angles.

ABSTRACT
State-of-the-art methods for learning cross-lingual word embeddings have relied on parallel corpora. Recent studies have shown that the need for parallel data supervision can be alleviated. This work shows that a bilingual dictionary between two languages can be built without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way. To this end, I apply a Generative Adversarial Network (GAN) and solve the orthogonal Procrustes problem. The dataset used in this thesis consists of monolingual corpora of English, French and Vietnamese collected from Wikipedia. The word embeddings used for training are Word2Vec and FastText.
Finally, I also present an evaluation of the dictionaries generated by these models.

DECLARATION
I declare that the thesis "Ứng dụng học sâu vào dịch từ vựng mà không cần dữ liệu song ngữ" (Applying deep learning to word translation without parallel data) is the result of my own research under the guidance and feedback of Assoc. Prof. Dr. Quản Thành Thơ. Information taken from other related works is clearly cited in the thesis. The research content and the results were produced by me and were not copied or taken from any other source. I take full responsibility for this declaration.
Ho Chi Minh City, 31/06
Student
Trần Quân

TABLE OF CONTENTS
MASTER'S THESIS ASSIGNMENT
ACKNOWLEDGEMENTS
THESIS SUMMARY
ABSTRACT
DECLARATION
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF CODE LISTINGS
LIST OF ABBREVIATIONS
1 INTRODUCTION
1.1 Overview
1.2 Applicability of the topic
1.3 Objectives and scope of the topic
1.4 Structure of the thesis
2 RELATED WORK
2.1 T. Mikolov, Q. V. Le, and I. Sutskever, "Exploiting similarities among languages for machine translation", arXiv preprint arXiv:1309.4168, 2013 [1]
2.2 C. Xing, D. Wang, C. Liu, and Y. Lin, "Normalized word embedding and orthogonal transform for bilingual word translation", Proceedings of NAACL, 2015 [2]
2.3 W. Ammar, G. Mulcaire, Y. Tsvetkov, G. Lample, C. Dyer, N. A. Smith, "Massively multilingual word embeddings", arXiv preprint arXiv:1602.01925, 2016 [3]
2.4 A. Conneau, G. Lample, M. Ranzato, L. Denoyer, H. Jégou, "Word translation without parallel data", arXiv preprint arXiv:1710.04087, 2018 [4]
3 THEORETICAL BACKGROUND
3.1 Artificial Neural Network (ANN)
3.1.1 Introduction
3.1.2 Activation functions
3.1.3 Loss function
3.1.4 Techniques for working with neural networks
3.2 Word embedding models
3.2.1 One-hot vectors
3.2.2 The Word2Vec model
3.2.3 The FastText model
3.3 The orthogonal Procrustes problem
3.3.1 Orthogonal systems
3.3.2 Singular Value Decomposition (SVD)
3.3.3 The orthogonal Procrustes problem and how to solve it
3.4 Generative Adversarial Network (GAN)
3.4.1 Introduction to GAN
3.4.2 Architecture of a GAN
3.4.3 Objective function of a GAN
3.4.4 How GAN training proceeds
4 METHODOLOGY
4.1 Data processing method
4.1.1 Data sources
4.1.2 Collecting and processing the data
4.2 Method for building the word embeddings
4.2.1 Training word2vec on the monolingual corpora
4.2.2 Training fasttext on the monolingual corpora
4.3 Method for building the GAN model
4.3.1 Architecture of the model
4.3.2 Distance measure between generated words and target-language words
4.3.3 Training the model
4.4 Methods for improving the effectiveness of GAN training
4.4.1 Updating the learning rate after each epoch
4.4.2 Using smoothing labels
4.4.3 Orthogonalizing the matrix
4.5 Method for solving the orthogonal Procrustes problem
4.6 Dictionary generation method
4.7 Evaluation method
5 IMPLEMENTATION AND EVALUATION
5.1 Model
5.2 Overview of the model-building steps
5.3 Data processing and word embedding training
5.4 Implementing the GAN model and the improvement techniques used during training
5.4.1 Language and libraries
5.4.2 The Discriminator network
5.4.3 Building the Mapper model
5.4.4 Parameters involved in GAN training
5.4.5 GAN training strategy
5.5 Implementing the solution of the orthogonal Procrustes problem
5.6 Implementing the dictionary
5.6.1 Results of the English - Vietnamese dictionary generated from Word2vec
5.6.2 Results of the English - French dictionary generated from Word2vec
5.6.3 Results of the English - French dictionary generated from Fasttext
5.7 Model results and remarks
5.7.1 Model evaluation
5.7.2 Effect of corpus quality and vocabulary size on the results
5.7.3 Effect of corpus quality and vocabulary size on the results
5.7.4 Effect of the vocabulary structure of the language on the results
5.7.5 Effect of the type of word embedding on the results
5.7.6 Effect of the Procrustes computation on the results
5.8 Future directions of the topic
REFERENCES

LIST OF FIGURES
Figure 1: Illustration of the mapping between the vector spaces of two languages
Figure 2: Illustration of a multilayer neural network
Figure 3: Graph of the tanh function
Figure 4: Graph of the sigmoid function
Figure 5: Graph of the ReLU function
Figure 6: Graph of the Leaky ReLU function
Figure 7: Illustration of the dropout technique
Figure 8: Illustration of how one-hot vectors are organized
Figure 9: Illustration of semantic relationships in word2vec
Figure 10: Illustration of the architecture of the word2vec model
Figure 11: Illustration of the CBOW model
Figure 12: Illustration of the neural network architecture of the Skip-gram model
Figure 13: Illustration of the out-of-vocabulary problem of word2vec
Figure 14: Illustration of the distribution of set B
Figure 15: Illustration of the distribution of set A
Figure 16: Result of aligning the two distributions RA and B
Figure 17: Illustration of the architecture of a GAN
Figure 18: Illustration of two distributions that are initially completely separate
Figure 19: The Discriminative network has the task of telling the two distributions apart
Figure 20: Initially, the Discriminative network easily tells the two distributions apart
Figure 21: Illustration of updating the weights of the Generative Model to produce a better new distribution
Figure 22: The Discriminative Model still detects a difference between the two distributions, so the error is propagated back to the Generative Model to keep updating its weights
Figure 23: Illustration of the models stopping once the two distributions match
Figure 24: Illustration of rotating distribution X with matrix W to match distribution Y
Figure 25: Illustration of the components and workflow of the model used in the thesis
Figure 26: Illustration of the dataset processing pipeline
Figure 27: The Discriminator model distinguishing real from fake distributions
Figure 28: Illustration of how the Mapper model works
Figure 29: Illustration of how the loss function of the model is built
Figure 30: Illustration of optimizing W by solving the Procrustes problem
Figure 31: Detailed illustration of how the model operates
Figure 32: Illustration of the dictionary generation process
Figure 33: Loss values of the GAN model after 25 epochs
Figure 34: Comparison of the results of the different dictionaries

LIST OF TABLES
Table 1: Illustration of CBOW using the surrounding words (context words) to predict the middle word (center word)
Table 2: Illustration of skip-gram using the middle word (center word) to predict the surrounding words
Table 3: Illustration of how FastText splits words into sub-words
Table 4: Illustration of the FastText training process
Table 5: English - Vietnamese dictionary
Table 6: English - French dictionary (Word2vec)
Table 7: English - French dictionary (FastText)

LIST OF CODE LISTINGS
- Decoding the Wikipedia data
- Building the word2vec and fasttext models
- Building the Discriminator model
- Building the Mapper model
- Implementing the Procrustes computation

LIST OF ABBREVIATIONS
ANN   Artificial Neural Network
DL    Deep Learning
ML    Machine Learning
G     Generative Model
D     Discriminator Model
M     Mapper Model
GAN   Generative Adversarial Networks
SVD   Singular Value Decomposition

1 INTRODUCTION
1.1 Overview
In this proposal, I propose building a system that generates dictionaries automatically without using a bilingual corpus. By extracting language features, I create a vector space of the vocabulary of each language, and then build a model that maps the vector space of the source language to that of the target language.
At that point, the words of the source language are mapped to the equivalent words of the target language. The figure below describes the model I plan to build.

Figure 1: Illustration of the mapping between the vector spaces of two languages

The figure illustrates how the model in this thesis works:
- First, we are given the distributions of words in the spaces of the two languages, English (red) and Vietnamese (purple).
- The task of the model is to transform the red distribution by multiplying it with an orthogonal matrix (W), producing a rotation that makes it match the purple distribution.
- The distances between the words of the two distributions are then measured after the rotation to find the closest pairs; at that point a word such as "cat" in English coincides with its Vietnamese counterpart "con_mèo".

The result of this project helps build bilingual dictionaries automatically without any bilingual corpus. This approach makes translation between less common languages (such as the languages of ethnic minorities) easier. The dictionary can also support some stages of training machine translation models.

1.2 Applicability of the topic
Starting from a solution for translating words between two languages without bilingual data, the thesis aims to generate a bilingual dictionary automatically. The topic is especially meaningful for less common languages such as the languages of ethnic minorities. Generating such dictionaries is a great help to officials working in remote areas who have no dictionary to consult.
Automatically generated dictionaries are also useful for training machine translation models. The generated pairs of words with the same meaning can be used to enrich the training data of such models, for example by replacing words in the dataset with words from the dictionary, or by applying teacher forcing to word pairs taken from the dictionary.

1.3 Objectives and scope of the topic
The objectives of this thesis are:
- Find and process corpora from several different sources. The corpora used in the thesis include the English, Vietnamese and French Wikipedia and baomoi.com.
- Study models that map between languages without using bilingual data. The thesis proposes combining two models, a "Generative Adversarial Network (GAN)" and "solving the Procrustes problem", to obtain the most effective language mapping matrix.
- Train and optimize the models, minimizing the loss function with several techniques such as smoothing and updating the learning rate according to a schedule.
- Generate the dictionary and evaluate its quality. The dictionary is generated fully automatically, without using any bilingual data for training. Its accuracy is evaluated by direct comparison with a real dictionary.
- Draw conclusions and outline future directions. The thesis obtained some promising results, with the generated dictionary reaching fairly high accuracy. Based on these results, the thesis also proposes directions for future work.

1.4 Structure of the thesis
The thesis is organized into chapters covering, in the order given in the source:
- An overview of the content, objectives and structure of the thesis.
- Background knowledge related to the topic, such as word embeddings, neural networks, the orthogonal Procrustes problem and GANs.
- Research works related to the topic.
- The methods used to implement the thesis.
- A description of the actual implementation of the system and an evaluation of the results.
- Chapter 6: A summary of the results achieved and directions for the future.

2 RELATED WORK
2.1 T. Mikolov, Q. V. Le, and I. Sutskever, "Exploiting similarities among languages for machine translation", arXiv preprint arXiv:1309.4168, 2013 [1]
In this work, Mikolov et al. observe that word embeddings have similar distributions across many languages, even languages from different cultures such as English and Vietnamese. They also propose that a mapping between the two sets of word embeddings can serve bilingual translation. A set of words in each language is taken as anchor points and the mapping matrix is then fitted to them, so the mapping between the two languages still relies on a bilingual dictionary to fix which words are equivalent and to map them onto each other. This approach therefore still requires bilingual data for anchoring. For less common languages that lack bilingual datasets, the model is hard to apply.

2.2 C. Xing, D. Wang, C. Liu, and Y. Lin, "Normalized word embedding and orthogonal transform for bilingual word translation", Proceedings of NAACL, 2015 [2]
This work proposes a way to normalize word vectors and to perform the linear transformation between two word embeddings through an orthogonal matrix. Chao Xing et al. constrain every update step of the language mapping matrix so that it stays an orthogonal matrix. The goal is to ensure that the vector transformation is only a rotation (or a reflection). This solution produced impressive results when translating the word spaces from English to Spanish. I also use this solution to normalize the mapping matrix W so that the model can produce the best results.

2.3 W. Ammar, G. Mulcaire, Y. Tsvetkov, G. Lample, C. Dyer, N. A. Smith, "Massively multilingual word embeddings", arXiv preprint arXiv:1602.01925, 2016 [3]
This work proposes a way to build a single shared word embedding that represents all the different languages. It relies entirely on monolingual corpora of many different languages around the world. The method needs a large number of languages to build the shared embedding, and the embedding only represents characteristics common to the languages rather than those specific to a particular language pair, so it is not suitable for building a dictionary for one specific language pair.

2.4 A. Conneau, G. Lample, M. Ranzato, L. Denoyer, H. Jégou, "Word translation without parallel data", arXiv preprint arXiv:1710.04087, 2018 [4]
This work proposes a way to build an unsupervised learning model. The authors use only two monolingual sets, one for the source language and one for the target language. Their method builds a special network that can by itself learn a linear mapping from the source-language space to the target-language space, based on a Generative Adversarial Network (GAN), without needing bilingual data for training. The dictionary is generated by letting the GAN align the two distributions on its own. I use this solution in the thesis, combined with the matrix orthogonalization of the work in 2.2, for the English - Vietnamese and French - Vietnamese dictionaries.

3 THEORETICAL BACKGROUND
3.1 Artificial Neural Network (ANN)
3.1.1 Introduction
The artificial neural network, often simply called a neural network, was introduced in 1943 by Warren McCulloch and Walter Pitts. It is an information processing model inspired by the nervous systems of living organisms and consists of a large number of neurons connected together to process information. In an artificial neural network, each neuron is a computational unit whose inputs and output are scalars. Each input has an associated weight. The neuron multiplies each input by its weight, sums the results, and applies a nonlinear function to produce its output. The neurons are connected into a network: the output of one neuron can be passed to the inputs of one or more other neurons. If the weights are set correctly, a neural network can approximate many complex mathematical functions.

Figure 2: Illustration of a multilayer neural network

The general architecture of an ANN has three components: the input layer, the hidden layers and the output layer. Figure 2 shows a basic neural network with 2 hidden layers. Each circle is a neuron, the incoming arrows are its inputs and the outgoing arrows are its outputs. The neurons are arranged in layers that represent the flow of information through the network. The bottom layer has no incoming arrows and is the input of the network. Similarly, the top layer has no outgoing arrows and is the output of the network. The other layers are called "hidden" layers.
The symbol $\sigma$ inside the neurons denotes the nonlinear function (activation function) sigmoid, $\sigma(x) = \frac{1}{1 + e^{-x}}$, which is applied to the value of the neuron before it is output. Every neuron is connected to all neurons in the next layer, which is why such a layer is called "fully connected".
The values of each layer in a neural network can be viewed as a vector. In Figure 2, the input layer is a 4-dimensional vector ($x$) and the layer above it is a 6-dimensional vector ($h_1$). The fully connected layer can be seen as a linear transformation of a vector from 4 dimensions to 6 dimensions. A fully connected layer implements a matrix multiplication, $h = xW$, where $W_{ij}$ is the weight of the connection from the $i$-th neuron of the previous layer to the $j$-th neuron of the current layer. The value of $h$ is then transformed by a nonlinear function $g$ and passed to the next layer.

3.1.2 Activation functions
Many kinds of nonlinear functions can be used in the hidden layers. There is currently no theory that says which nonlinear function to use in which case; the right one for a specific task is chosen empirically. The most commonly used nonlinear functions are sigmoid, tanh, hard tanh, rectified linear unit (ReLU) and Leaky ReLU.

- Tanh
The tanh function is defined as $\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$. It is S-shaped and maps the value $x$ into the range $[-1, 1]$.
Figure 3: Graph of the tanh function

- Sigmoid
The sigmoid function is defined as $\sigma(x) = \frac{1}{1 + e^{-x}}$. It is S-shaped and maps the value $x$ into the range $[0, 1]$.
Figure 4: Graph of the sigmoid function

- ReLU
ReLU is a nonlinear function that is simple to use and gives very good results in practice. It maps every value $x < 0$ to 0. Despite its simplicity, ReLU is effective for many tasks, especially when combined with dropout regularization.
ReLU has the form:
$$\mathrm{ReLU}(x) = \begin{cases} 0 & x < 0 \\ x & \text{otherwise} \end{cases}$$
Figure 5: Graph of the ReLU function

- Leaky ReLU
Leaky ReLU is an improvement that removes the dying ReLU problem. Instead of returning 0 for negative inputs, Leaky ReLU produces a line with a small slope. Leaky ReLU has the form:
$$\mathrm{LeakyReLU}(x) = \begin{cases} \alpha x & x < 0 \\ x & \text{otherwise} \end{cases}$$
where $\alpha$ is a very small constant.
Figure 6: Graph of the Leaky ReLU function

3.1.3 Loss function
As when training a linear classifier, training a neural network requires defining a loss function $L(\hat{y}, y)$ that expresses the loss incurred by predicting $\hat{y}$ when the correct result is $y$. The goal of training is to minimize the loss over all the training examples. The function $L(\hat{y}, y)$ assigns a (scalar) score to the network output $\hat{y}$ given the desired result $y$. The loss is always non-negative and equals 0 only when the output of the network is correct.
The parameters of the network (the matrices $W_i$ and biases $b_i$) are adjusted to minimize the loss over the whole training set (usually the sum of the losses of the individual training examples is minimized).
The loss can be any function that maps two vectors to a scalar. For practical optimization during training, the loss is usually restricted to functions whose gradient is convenient to compute. Commonly used loss functions are: hinge loss (binary), hinge loss (multiclass), log loss, categorical cross-entropy loss and ranking loss.
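The definitions in Sections 3.1.1 to 3.1.3 can be sketched in a few lines of Python. This is only a minimal illustration written for this text, not the code used in the thesis: the function names, the value alpha = 0.01 for Leaky ReLU, and the concrete forms chosen for the binary hinge loss and the log loss are assumptions made here. It uses plain lists instead of a tensor library so that it runs with the standard library alone.

```python
import math

def sigmoid(x):
    # sigma(x) = 1 / (1 + e^(-x)), maps x into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # tanh(x) = (e^(2x) - 1) / (e^(2x) + 1), maps x into (-1, 1)
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)

def relu(x):
    # 0 for negative inputs, identity otherwise
    return x if x > 0 else 0.0

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed small slope for negative inputs
    return x if x > 0 else alpha * x

def dense(x, W, g):
    # Fully connected layer h = g(xW).
    # x: row vector of length n_in; W: n_in rows, each of length n_out.
    n_out = len(W[0])
    return [g(sum(x[i] * W[i][j] for i in range(len(x)))) for j in range(n_out)]

def hinge_loss(score, y):
    # Binary hinge loss, with label y in {-1, +1}
    return max(0.0, 1.0 - y * score)

def log_loss(p, y):
    # Binary log loss, with label y in {0, 1} and p the predicted
    # probability of class 1 (0 < p < 1)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

if __name__ == "__main__":
    x = [1.0, 2.0]                  # 2-dimensional input
    W = [[1.0, -1.0], [0.5, 0.5]]   # maps 2 dimensions to 2 dimensions
    print(dense(x, W, relu))
    print(hinge_loss(0.3, 1), log_loss(0.5, 1))
```

Passing the activation `g` as an argument mirrors the text: the layer first computes the linear map xW and only then applies the nonlinearity, so swapping sigmoid for ReLU does not change the layer itself. Both example losses are non-negative and shrink as the prediction approaches the correct label, which is the property described in Section 3.1.3.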