Aside from the ability to 'grab' this text, there are also numerous functions for manipulating it once it has arrived: removing line spaces, removing dates and copyright statements (!), removing lines, converting the case of the text, etc. The DLL also provides functions for detecting whether two titles fall into a similar category, and for categorising headlines into a specified group. However, don't expect miracles... it appears simply to search the title for a keyword you specify, rather than interpreting the meaning of the text and what it might relate to!
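To show why keyword matching of this kind falls short, here is a minimal Python sketch of the approach as it appears to work. It is not WebZinc's code or API: the function name `categorise`, the `categories` dictionary and the keyword lists are all made up for illustration.

```python
def categorise(headline, categories):
    """Return the first category with a keyword found in the headline.

    Hypothetical helper, not part of WebZinc. It does a plain
    case-insensitive substring search and knows nothing about meaning.
    """
    lowered = headline.lower()
    for name, keywords in categories.items():
        if any(keyword.lower() in lowered for keyword in keywords):
            return name
    return None  # no keyword matched


# Illustrative keyword lists (assumed, not taken from the product).
categories = {
    "sport": ["football", "cricket"],
    "finance": ["shares", "bank"],
}

print(categorise("Bank shares rally", categories))
print(categorise("River bank floods after storm", categories))
```

The second headline is filed under "finance" purely because it contains the word "bank", which is exactly the sort of result you should expect from a keyword search.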
Another great feature is the ability to automatically fill out and submit a form on the internet, and then extract the text from the resulting page. WebZinc also provides useful properties that return the domain of a website, whether frames are present (and if so, you can retrieve the main page URL), the last updated date of a webpage, its page title, and the whole of its text.
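For readers without the DLL, the page properties described above can be approximated with Python's standard library. This is only a rough sketch under stated assumptions: `PageSummary` and `summarise` are hypothetical names, nothing here is WebZinc's API, and form submission and the last updated date are left out because they need a live HTTP request.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageSummary(HTMLParser):
    """Collects the title, visible text and frame sources of a page.

    Hypothetical helper for illustration, not a WebZinc class.
    """

    def __init__(self):
        super().__init__()
        self.title = ""
        self.text_parts = []
        self.frame_sources = []
        self._in_title = False
        self._skip_depth = 0  # greater than 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.frame_sources.append(src)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def summarise(url, html):
    """Return the domain, title, frame details and text for a page."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return {
        "domain": urlparse(url).netloc,
        "title": parser.title.strip(),
        "has_frames": bool(parser.frame_sources),
        "frame_sources": parser.frame_sources,
        "text": " ".join(parser.text_parts),
    }


# Sample page held in a string so the sketch runs without a network.
sample = (
    "<html><head><title>News</title><style>p{}</style></head>"
    '<body><p>Hello</p><iframe src="main.html"></iframe>'
    "<p>World</p></body></html>"
)
info = summarise("http://www.example.com/page", sample)
print(info)
```

In real use you would fetch the HTML first (for example with `urllib.request.urlopen`) and pass the result to `summarise`.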