Septina Dian Larasati

Natural Language Processing
Machine Learning Software Engineering

septina.larasati [at] gmail [dot] com

About Me

I am a Natural Language Processing (NLP) researcher and engineer specializing in under-resourced languages. Most of my research has been in general computational linguistics and Indonesian. I am best known as the owner of MorphInd: Indonesian Morphological Analyzer.

I was a Marie Curie Early Stage Researcher at SIA Tilde, Latvia, under CLARA (Common Language Resources and their Applications), an EU FP7 project. I was an Erasmus Mundus Scholarship awardee in the LCT (Language and Communication Technology) program, which I spent in Bolzano, Italy, and Prague, Czech Republic. I received my double Master's degree from the Free University of Bozen-Bolzano and Charles University in Prague in 2010. I received my undergraduate degree in Computer Science from the Faculty of Computer Science, University of Indonesia, in 2007.

Erasmus Mundus Joint Master Degrees | Erasmus+
Marie Skłodowska-Curie Actions Research Fellowship Programme
AWS Certified Solutions Architect – Associate

Resume

Education

2010 – ABD

Ph.D., Natural Language Processing

Marie Curie CLARA EU FP7 Early Stage Researcher Grant
Charles University in Prague, Czech Republic

2008 – 2010

M.Sc., 2010. Master in Language and Communication Technologies

European Masters Scholarship in Language & Communication Technologies
Charles University in Prague, Czech Republic
Free University of Bozen-Bolzano, Italy

2003 – 2007

B.Sc., 2007. Bachelor of Computer Science

University of Indonesia

Projects

MorphInd: Indonesian Morphological Analyzer

MorphInd is a robust finite-state morphology tool for Indonesian that handles both morphological analysis and lemmatization of a given surface word form, making its output suitable for further language processing. MorphInd consists of morphosyntactic and morphophonemic rules for analyzing Indonesian derivational and inflectional surface words. It is designed specifically for Indonesian.

IDENTIC v.10

IDENTIC v.10 is an Indonesian-English parallel corpus. It is available at the LINDAT repository.

Munch Gifts

Munch Gifts is my Etsy store.

Language Kits Morphology

This is an initial morphology API project for several languages under Language Kits. It is built with Natural Language Processing techniques, React, and AWS (S3, Lambda, DynamoDB), and styled with Semantic UI. [demo]

Publications

S.D. Larasati and N. Green, “The First 100 Days: A Corpus Of Political Agendas on Twitter”, Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 2018 [pdf] [bib]

S.D. Larasati, T. Järvinen, E. Bertol, M.M. Rizea, M.R. Santabalbina, and M. Souček, “Towards Cross-language Application of Dependency Grammar”, Proceedings of the 3rd International Conference on Dependency Linguistics (Depling 2015), Uppsala, Sweden, 2015 [pdf] [bib]

S.D. Larasati and N. Green, “Votter Corpus: A Corpus of Social Polling Language”, Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), Reykjavik, Iceland, 2014 [pdf] [bib]

S.D. Larasati, R.H. Susanto, and F.M. Tyers, “Rule-based Machine Translation between Indonesian and Malaysian”, Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012) WSSANLP Workshop, Mumbai, India, 2012 [pdf] [bib]

S.D. Larasati, “Handling Indonesian Clitics: A Dataset Comparison for an Indonesian-English Statistical Machine Translation System”, Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation (PACLIC 26), Bali, Indonesia, 2012 [pdf] [bib] Presented 11/2012

S.D. Larasati, N. Green, and Z. Žabokrtský, “Indonesian Dependency Treebank: Annotation and Parsing”, Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation (PACLIC 26), Bali, Indonesia, 2012 [pdf] [bib] Presented 11/2012

S.D. Larasati, “Improving Word Alignment by Exploiting Adapted Word Similarity”, Proceedings of the 10th Association for Machine Translation in the Americas Conference (AMTA 2012) MONOMT Workshop, San Diego, USA, 2012 Presented 11/2012

S.D. Larasati, “Towards an Indonesian-English SMT System: A Case Study of an Under-Studied and Under-Resourced Language, Indonesian”, Proceedings of the 21st Annual Student Conference, Week of Doctoral Students (WDS 2012), Prague, Czech Republic, 2012 [pdf] Presented 05/2012

S.D. Larasati, “IDENTIC Corpus: Morphologically Enriched Indonesian-English Parallel Corpus”, Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, 2012 [pdf] [bib] Presented 05/2012

S.D. Larasati, V. Kuboň, and D. Zeman, “Indonesian Morphology Tool (MorphInd): Towards an Indonesian Corpus”, Proceedings of Workshop on Systems and Frameworks for Computational Morphology (SFCM 2011), Zurich, Switzerland, 2011 [Springer] Presented 08/2011

S.D. Larasati and V. Kuboň, “A study of Indonesian-to-Malaysian MT System”, Proceedings of Workshop on Malay and Indonesian Language Engineering (MALINDO), Jakarta, Indonesia, 2010 [pdf] Presented 08/2010

S.D. Larasati, R. Manurung, and R. Mahendra, “Extending an Indonesian Semantic Analysis-based Question Answering System with Linguistic and World Knowledge Axioms”, Proceedings of the 22nd Pacific Asia Conference on Language, Information, and Computation (PACLIC 22), Cebu City, Philippines, 2008 [pdf] [bib]

S.D. Larasati and R. Manurung, “Towards a Semantic Analysis of Bahasa Indonesia for Question Answering”, Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics (PACLING 10), Melbourne, Australia, 2007 [pdf]