Suntek Computer Systems

Technical Notes


ChatGPT Considered Harmful

Yes, we borrowed our title from Dijkstra’s famous “Go To Statements Considered Harmful”. There is some relevance between the two. Like goto, which is very powerful (some form of it is needed in every programming language) but must be used carefully or else it could make codes error prone, ChatGPT is widely known as extremely powerful, but it must be treated carefully; if not, it could present to you as truth inconsistent or outright wrong information.


From information retrieval to search engines

Search engine has its root in information retrieval, which had undergone over half a century of research. How is information retrieval evolved into the search engines we see today?


Selecting Index Terms from a Document

Most web search engines today index every single word on a page because they have abundant storage and processing power. At the same, they can cater for the most unexpected queries (e.g., searching for exact phrases that contain stopwords). However, in general, it is not advisable to index every word but only those that have the highest values. How are values defined?


Extended Boolean Model

It has been well-known that the Boolean model is too inflexible, requiring skilful use of Boolean operators to obtain good results. On the other hand, the vector space model is flexible but not precise enough. Is there a middle ground?


×
full screen images