Handling natural languages does not necessarily involve grand schemes like speech recognition. A web developer, for example, may run into tasks such as validating names, romanizing page titles, or tokenizing sentences for indexing.
These everyday tasks have two aspects: method and knowledge. This talk first gives a brief introduction to the basic techniques and tools ("method") available to software developers. The other aspect is systematic information about a particular language ("knowledge"), which many open data projects can provide.
This talk will also briefly cover Formosa, a library for the Taiwanese languages written in Ruby and C++, and share some thoughts on how open source and open data projects can be organized to help in this field.