Topic Modelling for Analyzing Thai-Language Text from Social Media, in Cooperation with Feedback 180 Co., Ltd.

รัชฎา คงคะจันทร์

dc.contributor.author	รัชฎา คงคะจันทร์	th
dc.date.accessioned	2020-10-09T08:29:17Z
dc.date.available	2020-10-09T08:29:17Z
dc.date.issued	2020-10-09
dc.identifier.uri	https://repository.turac.tu.ac.th/handle/6626133120/909
dc.description.abstract	The volume of information on social media in the form of Thai-language text documents is growing rapidly, in both the number of documents and the diversity of their content. If the data and information in these documents can be analyzed to extract what is important and useful, the results can support business decision-making and strengthen the capabilities of businesses and government agencies in the country. One major problem in extracting information from Thai text documents is identifying the topic of a document from its content, by considering the relationships between words or groups of words that belong to the same topic. This research project therefore aims to find a method for building a topic model (Topic Modeling) from collections of Thai-language text documents on social media. The model should identify the topics of a document set, in which each document may contain more than one topic, and should also identify how each topic relates to the content of each document, so that a relevance score can be computed for each topic in a document.
Social media has become popular on the internet, and the data it generates grows every second. Most of the available data is text in an unstructured format. To analyze these data, the desired information must be extracted automatically so that it can inform an organization's business decisions and help improve its products or services to meet customer needs. An important step in information extraction is identifying topics in text documents. Topic Modelling (TM) refers to the automated extraction of topics from unstructured sources. Keyword extraction is a part of TM that discovers implicit and potentially important keywords in the underlying unstructured natural-language texts. Because written Thai does not use explicit word-delimiting characters, identifying individual words is difficult. In this project, an alternative word-formation method for noun phrase recognition is proposed. The word-formation method improves keyword extraction by using compound noun patterns. It is applied together with the TextRank algorithm to group noun phrases, which are then selected as the candidates that the algorithm ranks. The experimental dataset consists of 2,727 documents in the banking domain, drawn from social media such as Facebook and Twitter and from online news. The experiments yield an accuracy of 47.10%, a significant improvement attributed to word-formation. Accordingly, the extracted keywords are effective for TM.	th
dc.format.mimetype	application/pdf	th
dc.language.iso	tha	th
dc.publisher	Thammasat University Research and Consultancy Institute	th
dc.rights	This document is copyrighted by the funder. Reproducing, copying, or distributing it in altered form without written permission is prohibited.	th
dc.subject	Topic modelling	th
dc.subject	Thai text analysis	th
dc.subject	Social media	th
dc.subject	Feedback 180 Co., Ltd.	th
dc.title	Topic Modelling for Analyzing Thai-Language Text from Social Media, in Cooperation with Feedback 180 Co., Ltd.	th
dc.title.alternative	Topic modelling in Thai language from social media messages in cooperation with Feedback 180 Co., Ltd.	th
dc.type	Text	th
dcterms.accessRights	This document is accessible to the general public.	th
dc.rights.holder	Feedback 180 Co., Ltd.	th
cerif.cfProj-cfProjId	2563A00794	th
mods.genre	Research report	th
turac.projectType	Research project	th
turac.researchSector	Information and Communication Technology (ICT) sector	th
turac.contributor.client	Feedback 180 Co., Ltd.
turac.fieldOfStudy	Science and technology	th
cerif.cfProj-cfTitle	Topic Modelling for Analyzing Thai-Language Text from Social Media, in Cooperation with Feedback 180 Co., Ltd.	th
cerif.cfProj-cfProjStatus	Project completed	th
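
The abstract describes ranking noun-phrase candidates with the TextRank algorithm. The following is a minimal sketch of that general idea, not the project's implementation: it assumes the text has already been segmented into candidate phrases (the word-formation step is not reproduced here), links phrases that occur close together, and scores them with a PageRank-style iteration. The function name `textrank`, the window size, the damping factor of 0.85, and the sample banking phrases are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def textrank(candidates, window=2, damping=0.85, iterations=50):
    """Rank candidate phrases by TextRank over a co-occurrence graph.

    `candidates` is a list of already-segmented candidate phrases in
    document order. For Thai this would come from a word segmenter and
    a noun-phrase step, which this sketch does not implement.
    """
    # Build an undirected graph: two distinct phrases are linked when
    # they appear within `window` consecutive positions of each other.
    neighbors = defaultdict(set)
    for start in range(len(candidates)):
        span = candidates[start:start + window]
        for a, b in combinations(span, 2):
            if a != b:
                neighbors[a].add(b)
                neighbors[b].add(a)

    nodes = sorted(set(candidates))
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        updated = {}
        for node in nodes:
            # Each neighbor shares its score equally among its own links.
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[node])
            updated[node] = (1.0 - damping) + damping * incoming
        scores = updated
    # Highest-scoring phrases first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate phrases, in the order they appear in a document.
phrases = ["loan", "mobile banking", "fee", "mobile banking",
           "interest rate", "mobile banking", "branch"]
ranking = textrank(phrases)
for phrase, score in ranking:
    print(f"{phrase}\t{score:.3f}")
```

A phrase that co-occurs with many different phrases collects score from all of them, so it rises to the top of the ranking. The top-ranked phrases can then serve as keywords for a document.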

Files in this item

Name: no fulltext.doc
Size: 21.5Kb
Format: Microsoft Word


This item appears in the following Collection(s)

Research projects [72]
