出嚟食飯
GPT-4 is a very good Hongkongese POS Tagger
I have been working on word segmentation and part-of-speech tagging for Hongkongese. Then GPT-4 came out, and I heard its…
Mar 20, 2023

出嚟食飯
Building a Hongkongese Word Segmenter
In my previous story, I evaluated the performance of several NLP systems against Hong Kong data. One of the top performers is the CKIP…
Mar 12, 2023

出嚟食飯
Evaluating Cantonese Performance in NLP Systems
In natural language processing (NLP), often the first step is to tokenize a string, then the second step is to annotate the tokens with…
Jan 4, 2023

出嚟食飯
Archive to the Stand News Articles on Trial for Sedition in Hong Kong 2
Continuing to provide archive links to articles pertaining to case DCCC265/2022. Article list collected from the court news site The…
Dec 23, 2022

出嚟食飯
Archive to the Stand News Articles on Trial for Sedition in Hong Kong
At Toasty News, we obviously care deeply about news. Currently, there is a case against Stand News, the former editor-in-chief Chung…
Nov 2, 2022

出嚟食飯
Hongkongese Usage on Independent News Sites in 2020
At Toasty News, we collect and publish statistics about Hongkongese usage on independent news sites from Hong Kong. For previous reports…
Apr 11, 2021

出嚟食飯
ABC Cantonese-English Comprehensive Dictionary Review
When I was younger, I loved reading the dictionary. It was fun to read random entries and imagine the situations when the words would be…
Apr 5, 2021

出嚟食飯
Hong Kong Transformer Models And Other NLP Resources
I got access to TensorFlow Research Cloud for a month, so I spent most of that time training transformer models with Hongkongese data…
Jul 8, 2020

出嚟食飯
State of the art approach to Chinese Script Conversion: 2kenize
There are two character sets for writing Chinese characters in the world, Simplified Chinese (SC) and Traditional Chinese (TC). TC is…
May 12, 2020