
"Parsing" a web page for useful information

badwolf
Grafter
Posts: 88
Registered: ‎01-09-2007

"Parsing" a web page for useful information

Hi Guys,
Not sure if anyone can help with this, but I'll give it a shot.
At the place I work we have a "building management system" with all kinds of sensors dotted around the place, which measure temperature, gas/electric consumption, humidity etc. You can log in to the thing and get a webpage up with all of these readings listed, but I am only interested in a few of them.
I would like to display some of these readings on a custom webpage. Let's assume my webpage is hosted internally and has access to the building management system.
So, what I assume I need to do is somehow "read in" the contents of the HTML output page of the building management system, keep the useful bits (i.e. the current temperature), get rid of everything else, and then put the result onto my custom webpage, all automatically! Perhaps only updated whenever the page is refreshed (i.e. it doesn't have to be real time or anything!)
As the system that displays my webpage cannot run PHP/IIS etc., I would guess any processing would have to be done client side? Java perhaps?
Anyone ever done anything like this before!?
Thanks!
4 REPLIES
hadden
Grafter
Posts: 486
Thanks: 2
Registered: ‎27-07-2007

Re: "Parsing" a web page for useful information

Generally, it is possible to use client-side JavaScript to modify external web pages after they have loaded in the browser, for example to adjust their styling; that is what advert-blocking add-ons do. The script can be written to hide the unwanted parts of the web page, leaving only the bits that you want.
There are a few ways of doing this, depending on the number of end users of your modified page and the browsers that they use. However, all of them will likely require an understanding of HTML, JavaScript and CSS.
For example, if the result page was only required to be viewed on one PC and that PC was using Firefox then you could use the Stylish or Greasemonkey add-ons (I understand that Greasemonkey is also available for IE). Then you could examine the source HTML of the current web page, work out how to identify the required parts to keep and write the necessary script into one of the above add-ons.
If there weren't too many PCs, then you could extend the same method to each PC.
However, if the above is not practical, then I suspect a complete custom page could be written that embeds the original web page and also includes the scripts required to modify that content by hiding the unwanted parts. That custom page could then be viewed on any number of PCs if it was stored in a shared network location.
Community Veteran
Posts: 14,439
Thanks: 728
Fixes: 12
Registered: ‎01-08-2007

Re: "Parsing" a web page for useful information

I'm pretty sure that however you do this, client OR server side, you're going to need some pattern matching using regular expressions. JavaScript apparently supports regexes too, but I'm only just venturing into using them in PHP myself.
It IS doable, but personally I'd prefer to set up another WAMP install and use PHP with cURL to download the page, search for the patterns wanted and then output the final result on a new page.
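A rough sketch of the "download the page, then pattern match" idea, written as a shell function rather than PHP so it sits alongside the script later in the thread. Everything specific here is an assumption: the function name extract_reading, the label "Office Temp", and above all the markup (a label in one table cell, the value in the next, all on one line). The real building management page would need inspecting first, and regex matching on HTML tends to break if the layout changes.

```shell
#!/bin/sh
# Hypothetical helper: read HTML on stdin and print the number found in the
# table cell that follows the given label.
# Assumes markup like <tr><td>Label</td><td>21.5 C</td></tr> on one line,
# and a label containing no regex special characters.
extract_reading() {
    # -n plus the p flag prints only lines where the substitution matched
    sed -nE "s#.*<td>$1</td>[[:space:]]*<td>(-?[0-9]+(\.[0-9]+)?).*#\1#p"
}

# Stand-in for the downloaded page (made-up data, not real output)
sample='<tr><td>Office Temp</td><td>21.5 C</td></tr>
<tr><td>Humidity</td><td>48 %</td></tr>'

printf '%s\n' "$sample" | extract_reading "Office Temp"
```

In practice the sample text would be replaced by the output of curl or wget pointed at the readings page, with whatever login the system requires.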
Superuser
Superuser
Posts: 3,500
Thanks: 1,953
Fixes: 12
Registered: ‎10-04-2007

Re: "Parsing" a web page for useful information

I'm certain that you could make a VERY good attempt at doing what you want using AutoIt.
I've used it for a number of tasks and once you get over the learning hurdle, it's a very flexible and useful tool to have in the kit. Good forum and lots of example scripts to get you started.
Maurice
Example:  I wrote a script to automate getting my Profile and Exchange information from the Plusnet and Usertools sites and output them to a file.  Not very complex using the Firefox or Explorer User templates as a starter.
VileReynard
All Star
Posts: 11,191
Thanks: 306
Fixes: 11
Registered: ‎01-09-2007

Re: "Parsing" a web page for useful information

If you have Linux, you could parse a web page with a short shell script. A very simple version, which pulls your external IP address out of http://checkip.dyndns.org, is below.
Quote
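#!/bin/bash
# Timestamp as day/month/year@time (%X is the locale's time format)
date_stamp=$(date +%d/%m/%Y@%X)
# Fetch the page and keep only the first dotted-quad IPv4 address in it
exip=$(wget -qO - http://checkip.dyndns.org | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1)
# Append to the log (writing under /var/log normally needs root)
echo "$date_stamp $exip" >> /var/log/whatismyip.log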
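The same fetch-and-filter approach might be stretched to cover the original question: if a Linux box can run a script on a schedule, it could write the chosen readings into a static HTML file, which the display system could then serve without needing PHP. The sketch below only shows the "write the page" half. The function name render_page, the label=value input format and the sample readings are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn "label=value" lines on stdin into a minimal
# static HTML page on stdout. $1 is a free-text caption, e.g. a timestamp.
# No HTML escaping is done, so labels and values are assumed to be plain text.
render_page() {
    printf '<html><body><h1>Readings (%s)</h1>\n<ul>\n' "$1"
    while IFS='=' read -r label value; do
        printf '<li>%s: %s</li>\n' "$label" "$value"
    done
    printf '</ul>\n</body></html>\n'
}

# Made-up readings; in practice these lines would come from wget plus grep/sed
page=$(printf 'Office Temp=21.5\nHumidity=48\n' | render_page "example run")
printf '%s\n' "$page"
```

Redirecting the output to a file in the web server's document folder, and running the whole thing from cron, would give a page that refreshes as often as the job runs.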