The basic idea behind WebZinc is the ability to 'grab' text from a webpage, and most importantly, WebZinc makes this a cinch! The DLL provides functions to capture paragraphs, links in the text, table cells and rows, and even pictures. What's more, there are plenty of options for how to grab these. For example, you can grab a paragraph by its number, or grab the paragraph following a particular keyword or another paragraph. For pictures, you can pick out an image by its dimensions, its link, or its name, or simply capture all images on the page, alongside all the URLs. One thing it doesn't let you do directly is download the picture itself, but as you are most likely going to be inserting an img tag into another HTML page, it doesn't really matter. Of course, you can still use its FTP functions to download the image and save it to a local file if you need to.
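To make the "grab a paragraph by number or after a keyword" idea concrete, here is a minimal sketch in Python using only the standard library. It is not WebZinc's API: the names `grab_paragraphs`, `paragraph_by_number` and `paragraph_after_keyword` are hypothetical, and "following a keyword" is assumed here to mean the paragraph after the first one that contains the keyword.

```python
from html.parser import HTMLParser


class ParagraphGrabber(HTMLParser):
    """Collects the plain text of each <p> element, discarding the markup."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buf = None  # None while we are outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends where the next one starts
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def _flush(self):
        if self._buf is not None:
            # Collapse runs of whitespace so the result reads as plain text.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)
            self._buf = None


def grab_paragraphs(html):
    """Return the text of every paragraph on the page, in order."""
    grabber = ParagraphGrabber()
    grabber.feed(html)
    grabber.close()
    grabber._flush()  # pick up a trailing paragraph with no closing tag
    return grabber.paragraphs


def paragraph_by_number(html, n):
    """Return the n-th paragraph (1-based), or None if there is no such one."""
    paragraphs = grab_paragraphs(html)
    return paragraphs[n - 1] if 1 <= n <= len(paragraphs) else None


def paragraph_after_keyword(html, keyword):
    """Return the paragraph after the first one containing the keyword."""
    paragraphs = grab_paragraphs(html)
    for i, text in enumerate(paragraphs):
        if keyword in text:
            return paragraphs[i + 1] if i + 1 < len(paragraphs) else None
    return None
```

With a page such as `<p>Latest news</p><p>Story one.</p>`, calling `paragraph_after_keyword(page, "news")` would hand back the text of the second paragraph, with any inline tags already stripped out.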
All of these functions return the page text rather than any HTML, which makes it easy to display in your application. You can return the whole page's HTML, but it would be nice if these text-grabbing functions could return the related HTML too. You can even specify rules for the text being imported, so that text is automatically ignored or replaced before it is returned to your program; for example, removing all links containing 'internet.com'.
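The rule idea can be sketched along the same lines. The following Python example is an illustration only and does not reflect how WebZinc defines or applies its rules: `extract_text`, `blocked_link_substrings` and `replacements` are hypothetical names, and "removing a link" is assumed to mean dropping the link's text from the returned result.

```python
from html.parser import HTMLParser


class RuleTextExtractor(HTMLParser):
    """Extracts page text, dropping links that match a block rule."""

    def __init__(self, blocked_link_substrings=(), replacements=None):
        super().__init__()
        self.blocked = tuple(blocked_link_substrings)
        self.replacements = dict(replacements or {})
        self._chunks = []
        self._skip_depth = 0  # greater than 0 while inside a blocked <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self._skip_depth or any(s in href for s in self.blocked):
                self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Collapse whitespace, then apply the plain-text replacement rules.
        out = " ".join("".join(self._chunks).split())
        for old, new in self.replacements.items():
            out = out.replace(old, new)
        return out


def extract_text(html, blocked_link_substrings=(), replacements=None):
    """Return the page text after the ignore and replace rules are applied."""
    extractor = RuleTextExtractor(blocked_link_substrings, replacements)
    extractor.feed(html)
    extractor.close()
    return extractor.text()
```

Calling `extract_text(page, blocked_link_substrings=["internet.com"])` would then return the page text with the text of any link pointing at 'internet.com' left out, while other links keep their text.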