[Talk] Open Source Tools, Open Data, and Daily Tasks of Handling Natural Languages

Handling natural languages does not have to mean grand schemes like speech recognition. A web developer, for example, may run into tasks such as validating names, Romanizing page titles, or tokenizing sentences for indexing.

These daily tasks have two aspects: method and knowledge. This talk first gives a brief introduction to the basic techniques and tools ("method") available to software developers. The other aspect is systematic information about a particular language ("knowledge"), which many open data projects can provide.
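
As a rough illustration of the method/knowledge split, here is a minimal Python sketch of a greedy longest-match tokenizer. The tokenizer function is the "method"; the small lexicon is the "knowledge". Both the `tokenize` function and the `LEXICON` entries are made up for this example and are not taken from the talk or from any particular library; a real lexicon would come from an open data project.

```python
import re

# "Knowledge": a tiny, hypothetical lexicon of multi-word entries.
# In practice this data would be loaded from an open data source.
LEXICON = {"open source", "new york", "natural language"}


def tokenize(text, lexicon):
    """'Method': greedy longest-match tokenization over whitespace words.

    The logic knows nothing about any specific language; everything
    language-specific lives in the lexicon passed in.
    """
    words = re.findall(r"\w+", text.lower())
    # Longest entry in the lexicon, measured in words (1 if lexicon is empty).
    max_len = max((len(entry.split()) for entry in lexicon), default=1)

    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, then fall back to shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += n
                break
    return tokens


print(tokenize("Open source in New York City", LEXICON))
```

Swapping in a different lexicon changes the result without touching the function, which is the point of keeping the two aspects apart.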

This talk will also briefly cover Formosa, a library for the Taiwanese languages written in Ruby and C++, and share some thoughts on how open source and open data projects can be organized to help in this field.

About This Post

This page contains a single entry posted on February 20, 2008 10:25 AM.