Web | Blog | Twitter | YouTube | Coffee
Currently working on gathering texts on the Web and detecting word trends
First programs written on a TI-83 Plus in TI-BASIC
Python and command-line tool to gather text and metadata on the Web: crawling, scraping, extraction, and output as CSV, JSON, HTML, MD, TXT, or XML
Forked from saffsd/langid.py
Faster, modernized fork of the language identification tool langid.py
Curated list of open-access, open-source, and off-the-shelf resources and tools developed with a particular focus on German