What we’re trying to do today is get a value from a webpage and return it to a custom function within Excel. I’m going to use a very simple example, and it should be inferred that this is only for very simple tasks. The information you seek to return to your function will need to be presented on the webpage the same way every time you visit it, because all you’re doing is skipping the web browser, bringing the background contents of the webpage into a string, and parsing the string to pull out the data you want.
What I mean by the “background contents” is the code behind all webpages. When your browser calls up a URL, it receives a bunch of code, which it uses to render the page to you. If you’re unfamiliar with HTML code, you may have a bit more difficulty with this, but all you really need to be able to do is recognize where your information is always kept within this code. This sounds more complicated than it is, really.
For my example, I want to create a function called “temperature” to which I can pass a city and state, and from which I can gather the current temperature of that city. I live in Orlando, FL, so that is the temperature I want to know.
Important: The first stumbling block in this project is the need for a special reference from within Excel VBA — the Microsoft Internet Transfer Control reference. You may need to download msinet.ocx (you can find it pretty easily through popular search engines) in order to add this reference. To add a reference, click Tools -> References from within the VBA window, locate the reference, and select it. If the reference is not available, download msinet.ocx, put it somewhere easy like \Windows\System, and then locate the file via “Browse…” from the references window.
The code for my example function follows, and I will explain everything in the following paragraphs:
Function temperature(strCity as string)
strCity = Replace(strCity, ", ", "%2C+")
Dim myURL As String
Dim inet1 as inet
Dim mypage As String
Set inet1 = New Inet
.Protocol = icHTTP
.URL = myURL
mypage = .OpenURL(.URL, icString)
Set inet1 = nothing
Dim intStart As Integer, intEnd As Integer
intStart = InStr(mypage, "<span class=""wea_temp""></span>") + 22
intEnd = InStr(intStart, mypage, "&")
temperature = mid(mypage, intStart, (intEnd - intStart))
I’ll take you line-by-line through this somewhat lengthy example function. To begin, the name “temperature” and the parameter strCity means that when you input this function into a cell within Excel, you would type “=temperature” followed by the city and state in quotes and parenthesis, i.e. ‘=temperature(“Orlando, FL”)’.
The second line takes strCity and replaces the comma with “%2C+”. The reasoning for this is to properly construct the URL. In the language of a URL, %2C means “comma”, and the + denotes a space. You can’t have spaces or punctuation marks in URLs.
Then we declare myURL, which is what we would have typed into the web browser. If you were to go to msn.com and search for something (like ‘current temperature Orlando, FL’), then looked at the address bar, you would see something like this. In fact, that’s how I constructed the string, and that’s how I urge you to go about this as well. We add our city and state – gathered from the function parameter – to the end of the string.
The next part gets a little vague for me, because I haven’t had much experience with the Internet Transfer Control. Like most controls, you need to first point a variable to it (“Dim inet1 as inet”), and then initialize it (“Set inet1 = New Inet”). In our example, we need to worry about three aspects of the Inet control – Protocol, URL, and OpenURL. The first two are parameters, and in our case will be icHTTP for the protocol, and the URL we have already constructed as the URL.
The third (OpenURL) is the method we’re using to hit the URL. The two parameters you need to pass to it are the URL to hit and what to do with the result. Since we’re only interested in the webpage itself and not any files that might be downloaded, we just want to return the value to a string, hence, we tell it to use the icString data type. icString is a data type unique to the Internet Transfer Control, and unfortunately, I don’t know much about it except that it can be transferred to a standard string data type.
We complete the web hit with the statement “Set inet1 = nothing” in order to clear out our memory. This is just good practice.
This is the fun and somewhat difficult part. If you search in msn for the temperature in Orlando, FL and then take a look at the source of the website through a standard browser, you’ll see a lot of stuff that means nothing to you. Buried in all of this stuff is the current temperature. The easiest thing to do in this case was to search msn, get the actual temperature, and then search for that number in the source. I then noticed that the temperature occurs directly after the first instance of the phrase ‘<span class=”wea_temp”>’. The InStr function tells me what character within the HTML code begins that phrase, and as the phrase is 23 characters long, we add 22 to that number. Immediately after the temperature was the character “&”, so we perform another InStr function (passing our first value as the start point for that function) to get the first instance of “&” following the temperature. The temperature is then recorded using the mid function, which asks for the string, the start value, and the length of the string you’re asking for (which would be the position of the “&” minus the position of the first character of the temperature).
Like I said, that’s the difficult part. You have to be able to read through all that code to find your value, and you have to play around with the InStr and mid functions to get the right numbers. This is also why it’s vital that you only do this with static pages. I know msn.com will always return the temperature of the city and state I search for at the top of the search results, so I am confident that this function won’t break. But if msn.com put the temperature arbitrarily in the list of search results, this would be substantially more difficult.
UPDATE: For a second example of how this might be put to use in order to calculate drive times between two addresses, latitude and longitude pairings, zip codes, etc., see my post http://guj800.wordpress.com/2011/04/27/drive-time-calculations-in-excel/