Drive Time Calculations in Excel

Building on my most popular post, Getting Information From the Web Using Excel VBA, I had the need in a recent project to calculate drive times on many rows of data, and decided to build a function in Excel to handle the dirty work for me using Google Maps.

Essentially, we had 200 or so latitude/longitude points and needed to see which of six addresses was closest to each one from a drive-time perspective. Luckily, Google came to the rescue once again, because one can use latitude/longitude coordinates in lieu of an address in order to get directions (which include drive time) to a physical address (or, I suppose, to a second set of latitude/longitude coordinates). What’s more, the URL that asks Google Maps for this information is fairly simple:

http://maps.google.com/maps?q=from: [PointA] to: [PointB]

So, once I dug through the HTML code behind the google maps directions results to find the div element containing the drive time of the quickest route, it was easy enough to create the following function. The two parameters are the addresses, zip codes, coordinates, or whatever else Google will allow you to use to approximate the starting and ending points.

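Function DriveTime(PointA As String, PointB As String) As String

  ' Requires a reference to the Microsoft Internet Transfer Control (msinet.ocx)
  Dim myURL As String
  myURL = _
    "http://maps.google.com/maps?" & _
    "q=from: " & PointA & " to: " & PointB

  Dim inet1 As Inet
  Dim mypage As String

  Set inet1 = New Inet
  With inet1
    .Protocol = icHTTP
    .URL = myURL
    mypage = .OpenURL(.URL, icString)
  End With
  Set inet1 = Nothing

  ' The drive time of the quickest route sits inside this div
  Const MARKER As String = "<div class=""altroute-rcol altroute-info"">"
  Dim intStart As Long, intEnd As Long
  intStart = InStr(mypage, MARKER)
  If intStart = 0 Then
    DriveTime = "not found"   ' the page layout changed or no route came back
    Exit Function
  End If
  intStart = intStart + Len(MARKER)   ' skip past the 41-character marker
  intEnd = InStr(intStart, mypage, "</div>") - intStart
  DriveTime = Mid(mypage, intStart, intEnd)

End Function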

It took about 45 seconds to calculate the drive time for approximately 6 * 200, or 1,200, routes. My only other option was to copy and paste those 1,200 coordinates one by one into Google Maps and retype the drive time, which probably would have taken at least half a day. I call that a win.

Feel free to use this function for any number of drive time calculations, but I would assume that at some point Google will pitch a fit that your IP is sending so many requests to its map server. However, it didn’t bat an eye at my 1,200 queries, so who knows? Enjoy!
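
If you want to fill in a whole column of drive times without hammering the map server, one option might be to loop over the rows and pause between requests. This is only a sketch: the sheet layout (starting points in column A, destinations in column B, results in column C, data starting in row 2) and the one-second pause are assumptions of mine, not anything Google documents.

```vba
Sub FillDriveTimes()
    ' Hypothetical layout: origins in column A, destinations in column B,
    ' results written to column C, with a header in row 1.
    Dim lastRow As Long, i As Long
    lastRow = Cells(Rows.Count, 1).End(xlUp).Row

    For i = 2 To lastRow
        Cells(i, 3).Value = DriveTime(CStr(Cells(i, 1).Value), CStr(Cells(i, 2).Value))
        ' Pause one second between requests (an arbitrary, assumed delay)
        Application.Wait Now + TimeValue("0:00:01")
    Next i
End Sub
```

Writing the results as values rather than leaving =DriveTime(...) formulas in the cells also means the requests are not sent again every time the sheet recalculates.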

Next Unique and Previous Unique

Why this isn’t already a function of Excel baffles me. Perhaps my work is somewhat unique. I generally work with very large datasets (over 100,000 rows, often nearing the 1,048,576-row limit of Excel 2007), and I often find myself needing to scroll through the data in order to find the next value in a series.

Consider a spreadsheet with three columns: Fruit, Name, and Score.  Imagine this is populated with the data of a survey of 100,000 people and their opinions of Apples, Bananas, and Oranges.  You can assume that with three fruit types, there would be 300,000 total rows.  Now imagine opening this file in Excel, and imagine it is already sorted by Fruit.  How would you go about locating the first entry of Bananas?

The obvious method is to simply use the scroll bar and drag down until Bananas appear.  A second method would be to press and hold Page Down until this section appears.  In either case, it is far too easy to surpass the intended row, requiring a similar method in the opposite direction.  This can sometimes force you to go back and forth a few times until homing in on that row.

Now imagine that you just want to access the last row in the data, regardless of the contents of the cells. Just press Ctrl + Down Arrow, and you’re there. Wouldn’t it be nice if there were a similar shortcut for skipping down to the next unique value? Well, there can be!

Consider the following VBA macros:

Sub findFirst()
   Dim targetString As String
   targetString = ActiveCell.Text

   If ActiveCell.Row = 1 Then
      Exit Sub
   ElseIf Cells(ActiveCell.Row - 1, ActiveCell.Column).Text <> targetString Then
      ' Already on the first entry of this value: step up to the previous value
      Cells(ActiveCell.Row - 1, ActiveCell.Column).Select
   Else
      ' Jump to the first exact match in the current column
      ActiveSheet.Columns(ActiveCell.Column).Find(targetString, _
        LookIn:=xlValues, LookAt:=xlWhole).Select
   End If
End Sub

Sub findLast()
   Dim targetString As String
   targetString = ActiveCell.Text

   If ActiveCell.Row = Rows.Count Then
      Exit Sub
   ElseIf Cells(ActiveCell.Row + 1, ActiveCell.Column).Text <> targetString Then
      ' Already on the last entry of this value: step down to the next value
      Cells(ActiveCell.Row + 1, ActiveCell.Column).Select
   Else
      ' Search backwards (wrapping to the bottom) for the last exact match
      ActiveSheet.Columns(ActiveCell.Column).Find(targetString, _
        SearchDirection:=xlPrevious, LookIn:=xlValues, LookAt:=xlWhole).Select
   End If
End Sub

In both instances, we’re using the same built-in Find method that powers Excel’s Find and Find/Replace dialogs. In the first macro, we first check to make sure we’re not already at the first row, and if not, we use Find to locate the first instance of the current value in the current column. So, using our original example, if we were midway through the Apple entries and wanted to return to the first one, this macro would look at the Fruit column and search for the first instance of “Apple”.

The second macro does much the same, but uses the xlPrevious search direction. The search is anchored at the first cell in the current column and runs backwards from there, which makes Excel wrap around to the bottom of the spreadsheet and search upward. So, if we are halfway through the “Apple” entries and use this macro, Excel will start at cell A1048576 and search upward until reaching cell A100001, which (with a header in row 1) would be the last instance of “Apple” in our hypothetical data.

In either function, if the cell directly above or below (respectively) the active cell has a different value than the current value, that cell is simply selected.
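
To get the Ctrl + Down Arrow feel, the two macros can be tied to keystrokes with Application.OnKey. The sketch below assumes the macros are named findFirst and findLast as above; the Ctrl + Alt + Up/Down combinations are simply my choice for illustration, and they may already be claimed by something else on your machine.

```vba
Sub BindUniqueKeys()
    ' Assumed shortcuts: Ctrl+Alt+Up and Ctrl+Alt+Down (^ = Ctrl, % = Alt)
    Application.OnKey "^%{UP}", "findFirst"
    Application.OnKey "^%{DOWN}", "findLast"
End Sub

Sub UnbindUniqueKeys()
    ' Calling OnKey without a procedure restores the keys' normal behavior
    Application.OnKey "^%{UP}"
    Application.OnKey "^%{DOWN}"
End Sub
```

Run BindUniqueKeys once per Excel session; the bindings last until Excel closes or UnbindUniqueKeys is run.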

Progressive Status Indicators in Excel

This is one of those posts that is really only for my own benefit. Rather than just putting red, yellow, and green indicator lights on a spreadsheet to show how close we were to meeting budget in certain areas, I recently decided to let Excel determine programmatically how “green” or how “red” we were, with yellow in the middle.

So, in other words, if red is RGB(250,0,0), yellow is RGB(250,250,0), and green is RGB(0,250,0), how do we gradually go from red to green depending on how far away from budget we are? By the way, for those who don’t know, an RGB color is simply three numbers that represent how much red, green, and blue the computer ought to use to compose a color. Each number can technically run from 0 to 255, but I cap mine at 250 because it keeps the arithmetic tidy. Adding red light to green light makes yellow light, so RGB(250,250,0) means you want to use full red and full green, but no blue, in order to get yellow.

My first task was deciding how to measure the distance from budget. After a lot of calculations, I realized it should simply be how far from 100% of the budget we were. That leaves a scale of 0 to 100. At 100, we want to just display green, and at 0 we want red.

My second task was fading from red to green by going through yellow. My best solution was to split the task in half. If the actual value was less than 50% of the goal, we would worry about fading from red to yellow (we would only need to mess with the green color). If the actual value was more than 50%, we would fade from yellow to green (only mess with the red color). In other words: if < 50% of budget, red = 250 and green = calculated value; if > 50% of budget, green = 250 and red = calculated value. Blue is always 0.

Since each half of the fade (red to yellow, yellow to green) moves one color through all 250 steps while covering only 50 percentage points of budget, for every percent of budget we want to add or remove 5 steps of color.
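
To make the arithmetic concrete, here is a minimal sketch of the same idea as a standalone function on the 0 to 100 scale. BudgetColor is a made-up name for illustration, and clamping out-of-range input to 0 and 100 is my own assumption.

```vba
Function BudgetColor(pctOfBudget As Double) As Long
    ' pctOfBudget is assumed to be on a 0 to 100 scale (100 = on budget)
    Dim pct As Double
    pct = pctOfBudget
    If pct < 0 Then pct = 0
    If pct > 100 Then pct = 100

    Dim r As Long, g As Long
    If pct < 50 Then
        r = 250
        g = CLng(pct * 5)          ' red to yellow: only green changes
    Else
        g = 250
        r = CLng((100 - pct) * 5)  ' yellow to green: only red changes
    End If

    BudgetColor = RGB(r, g, 0)     ' blue is always 0
End Function
```

With this, BudgetColor(0) is RGB(250,0,0), BudgetColor(50) is RGB(250,250,0), BudgetColor(75) is RGB(125,250,0), and BudgetColor(100) is RGB(0,250,0).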

That’s all I need to explain, I think, in order to remember this later. If you have any questions, I will gladly answer them. What follows are two real-world examples taken straight from my current project.

Sub looper()

    Dim mydiff As Double
    Dim i As Integer

    For i = 5 To 10
        mydiff = (Cells(i, 23) - Cells(i, 24)) / Cells(i, 24)
        Call update_indicator("shp" & i - 4, mydiff)
    Next i

End Sub

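Sub update_indicator(strShape As String, ByVal dblVar As Double)

    Dim intR As Integer
    Dim intG As Integer
    Dim intB As Integer

    intB = 0

    ' dblVar is the fraction over (+) or under (-) budget; cap it at 100% under
    ' so the green value can never go negative
    If dblVar < -1 Then dblVar = -1

    If dblVar < -0.5 Then
        ' More than 50% under budget: fade from red toward yellow
        intR = 250
        intG = 250 - (((Abs(dblVar) * 100) - 50) * 5)
    ElseIf dblVar < 0 Then
        ' Less than 50% under budget: fade from yellow toward green
        intG = 250
        intR = 0 + ((Abs(dblVar) * 100) * 5)
    Else
        ' On or over budget: plain green
        intG = 250
        intR = 0
    End If

    ' Color the shape directly; there is no need to select it first
    ActiveSheet.Shapes(strShape).Fill.ForeColor.RGB = RGB(intR, intG, intB)

End Sub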

Getting Information From the Web Using Excel VBA

What we’re trying to do today is get a value from a webpage and return it to a custom function within Excel. I’m going to use a very simple example, and it should be inferred that this is only for very simple tasks. The information you seek to return to your function will need to be presented on the webpage the same way every time you visit it, because all you’re doing is skipping the web browser, bringing the background contents of the webpage into a string, and parsing the string to pull out the data you want.

What I mean by the “background contents” is the code behind all webpages. When your browser calls up a URL, it receives a bunch of code, which it uses to render the page to you. If you’re unfamiliar with HTML code, you may have a bit more difficulty with this, but all you really need to be able to do is recognize where your information is always kept within this code. This sounds more complicated than it is, really.

For my example, I want to create a function called “temperature” to which I can pass a city and state, and from which I can gather the current temperature of that city. I live in Orlando, FL, so that is the temperature I want to know.

Important: The first stumbling block in this project is the need for a special reference from within Excel VBA — the Microsoft Internet Transfer Control reference. You may need to download msinet.ocx (you can find it pretty easily through popular search engines) in order to add this reference. To add a reference, click Tools -> References from within the VBA window, locate the reference, and select it. If the reference is not available, download msinet.ocx, put it somewhere easy like \Windows\System, and then locate the file via “Browse…” from the references window.

The code for my example function follows, and I will explain everything in the following paragraphs:

Function temperature(ByVal strCity As String) As String

  strCity = Replace(strCity, ", ", "%2C+")

  Dim myURL As String
  myURL = "http://search.msn.com/results.aspx?q=current+temperature+" _
    & strCity

  Dim inet1 As Inet
  Dim mypage As String

  Set inet1 = New Inet
  With inet1
    .Protocol = icHTTP
    .URL = myURL
    mypage = .OpenURL(.URL, icString)
  End With
  Set inet1 = Nothing

  ' The temperature sits between this 23-character marker and the next "&"
  Const MARKER As String = "<span class=""wea_temp"">"
  Dim intStart As Long, intEnd As Long
  intStart = InStr(mypage, MARKER) + Len(MARKER)
  intEnd = InStr(intStart, mypage, "&")

  temperature = Mid(mypage, intStart, (intEnd - intStart))

End Function

I’ll take you line-by-line through this somewhat lengthy example function. To begin, the name “temperature” and the parameter strCity mean that when you enter this function into a cell within Excel, you would type “=temperature” followed by the city and state in quotes and parentheses, e.g. ‘=temperature(“Orlando, FL”)’.

The second line takes strCity and replaces the comma and the space after it with “%2C+”. The reasoning for this is to properly construct the URL. In the language of a URL, %2C means “comma”, and the + denotes a space. URLs can’t contain literal spaces, and punctuation such as commas is safest when encoded this way.
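
If your search terms might contain more than a comma and a space, a more general encoder could look something like the sketch below. UrlEncodeSimple is a hypothetical helper of my own, not part of the function above, and it only handles plain ASCII text.

```vba
Function UrlEncodeSimple(ByVal rawText As String) As String
    ' Hypothetical helper: handles plain ASCII text only
    Dim i As Long, ch As String, result As String

    For i = 1 To Len(rawText)
        ch = Mid(rawText, i, 1)
        If ch Like "[A-Za-z0-9]" Then
            result = result & ch                  ' letters and digits pass through
        ElseIf ch = " " Then
            result = result & "+"                 ' a space becomes +
        Else
            ' Everything else becomes % followed by its two-digit hex code
            result = result & "%" & Right("0" & Hex(Asc(ch)), 2)
        End If
    Next i

    UrlEncodeSimple = result
End Function
```

For example, UrlEncodeSimple("Orlando, FL") returns Orlando%2C+FL, the same string the Replace call produces.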

Then we declare myURL, which is what we would have typed into the web browser. If you were to go to msn.com and search for something (like ‘current temperature Orlando, FL’), then looked at the address bar, you would see something like this. In fact, that’s how I constructed the string, and that’s how I urge you to go about this as well. We add our city and state – gathered from the function parameter – to the end of the string.

The next part gets a little vague for me, because I haven’t had much experience with the Internet Transfer Control. Like most controls, you need to first declare a variable for it (“Dim inet1 as inet”), and then initialize it (“Set inet1 = New Inet”). In our example, we need to worry about three members of the Inet control – Protocol, URL, and OpenURL. The first two are properties, and in our case will be icHTTP for the protocol, and the URL we have already constructed as the URL.

The third (OpenURL) is the method we’re using to hit the URL. The two parameters you need to pass to it are the URL to hit and what form the result should take. Since we’re only interested in the webpage itself and not any files that might be downloaded, we just want the result returned as a string, hence we pass icString. icString is a constant defined by the Internet Transfer Control that tells OpenURL to hand back text rather than a byte array, and the result can be assigned to a standard string variable.

We complete the web hit with the statement “Set inet1 = nothing” in order to clear out our memory. This is just good practice.
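
If you can’t get hold of msinet.ocx, the same kind of web hit can probably be made without it. The sketch below swaps in a different technique, the MSXML2.XMLHTTP object that ships with Windows, so it is not the Inet control described in this post. FetchPage is a hypothetical name, and I’m assuming a plain GET request is all the page needs.

```vba
Function FetchPage(ByVal pageUrl As String) As String
    ' Late binding, so no extra reference should be needed
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP")

    http.Open "GET", pageUrl, False   ' False = wait for the response
    http.send

    If http.Status = 200 Then
        FetchPage = http.responseText
    Else
        FetchPage = ""                ' treat any other status as "no page"
    End If
    Set http = Nothing
End Function
```

The returned string can then be parsed with InStr and Mid in the same way as the result of OpenURL.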

This is the fun and somewhat difficult part. If you search in msn for the temperature in Orlando, FL and then take a look at the source of the website through a standard browser, you’ll see a lot of stuff that means nothing to you. Buried in all of this stuff is the current temperature. The easiest thing to do in this case was to search msn, get the actual temperature, and then search for that number in the source. I then noticed that the temperature occurs directly after the first instance of the phrase ‘<span class="wea_temp">’. The InStr function tells me at what character within the HTML code that phrase begins, and as the phrase is 23 characters long, the temperature itself begins 23 characters past that position. Immediately after the temperature was the character “&”, so we perform another InStr function (passing our first value as the start point for that function) to get the first instance of “&” following the temperature. The temperature is then extracted using the Mid function, which asks for the string, the start value, and the length of the string you’re asking for (which would be the position of the “&” minus the position of the first character of the temperature).
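
The find-the-marker, find-the-end, cut-out-the-middle routine described above can be wrapped up so you never have to count characters by hand. This is only a sketch; TextBetween is a name I made up, and returning an empty string when a marker is missing is my own choice.

```vba
Function TextBetween(ByVal pageText As String, ByVal startMarker As String, _
                     ByVal endMarker As String) As String
    Dim startPos As Long, endPos As Long

    startPos = InStr(pageText, startMarker)
    If startPos = 0 Then Exit Function        ' marker missing: return ""
    startPos = startPos + Len(startMarker)    ' first character after the marker

    endPos = InStr(startPos, pageText, endMarker)
    If endPos = 0 Then Exit Function          ' end marker missing: return ""

    TextBetween = Mid(pageText, startPos, endPos - startPos)
End Function
```

For example, TextBetween("<b>75&deg;</b>", "<b>", "&") returns 75. Using Len(startMarker) instead of a typed-in number means the offset stays right even if the marker changes.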

Like I said, that’s the difficult part. You have to be able to read through all that code to find your value, and you have to play around with the InStr and mid functions to get the right numbers. This is also why it’s vital that you only do this with static pages. I know msn.com will always return the temperature of the city and state I search for at the top of the search results, so I am confident that this function won’t break. But if msn.com put the temperature arbitrarily in the list of search results, this would be substantially more difficult.

UPDATE: For a second example of how this might be put to use in order to calculate drive times between two addresses, latitude and longitude pairings, zip codes, etc., see my post https://guj800.wordpress.com/2011/04/27/drive-time-calculations-in-excel/

Create Your Own Excel Add-Ins

Yesterday I showed you how to create your own functions using Excel VBA, or Visual Basic for Applications. Today’s post will take this to the next level.

If you’ve played with creating and using your own functions, you may have noticed that once you close that workbook, you lose the ability to use that function. This is because Excel uses its built-in functions first, then looks to what is saved in the workbooks that are open. Excel doesn’t by nature save VBA code into itself; it only reads what’s been saved into the individual files.

Here’s an example. Say you’re working on a physics project for school. You’re working out of your spiral notebook from which you’ve been working all year. When you need to remember the formula for momentum, you can just flip through your notebook, find where you wrote it down, and use it to calculate the problem you’re currently working on. If you were working in a different book, you’d have to go back to this book to flip through and find the formula. If you didn’t have the book with the formula written on it (assuming you lack the cognitive capacity to memorize it), you wouldn’t be able to solve the problem.

So the Excel file in which we saved our CAGR formula yesterday is the spiral notebook you’ve been taking to class every day, and in order to remember how to solve that formula, Excel needs to keep the spiral notebook open.

But, there is a way to make Excel memorize the formula. It is done with add-ins.

Generally speaking, when a program accepts add-ins, they have to be written in a more involved programming language than VBA, and they have to be compiled in a very specific manner such that the host program is able to read and employ them. In the case of Excel, you are able to create your own add-ins just by saving a certain Excel file as a “Microsoft Excel Add-In (*.xla)”, which is the last type in the “Save As Type” drop-down when saving a file.

The methodology here is that any VBA code you write – be it custom functions or macros – will be saved in this .xla file, and when you include it as an add-in for Excel…wait, I’m getting ahead of myself.

Let’s start from the beginning. Go ahead and open up a new spreadsheet in Excel. Don’t worry about the cells within this worksheet. You can type whatever you want there, or you can type nothing at all. What you want to focus on is the VBA window.

On the menu, select Tools -> Macro -> Visual Basic Editor. You want a spot to type stuff, right? So click Insert -> Module. Here is our blank canvas where we enter in our custom formulas and whatnot. We’ll use a simple formula for today’s example. Type this into your module:

Function stupid_formula (myinput as integer)
   stupid_formula = myinput ^ 3
End Function

If you can’t tell, all this function will do is cube what you pass to it. If you were to close this window out, go back into your spreadsheet and type “=stupid_formula(2)” into a cell and hit enter, the cell’s value would be 8. However, if you were to close this workbook, open a new workbook, and type that formula in, the cell’s value would be “#NAME?” because Excel already forgot what you wrote in its spiral notebook.

Ok, so you’ve written “stupid_formula” in your module, and you haven’t closed your Excel file. Here’s where the fun starts. Close your module window to get back to the spreadsheet with which you began this journey. Save it (via File -> Save) as a “Microsoft Excel Add-In”. Once you select that as the file type, Excel will default to the “Add-Ins” directory, which is probably a pretty good place to save these. For this post, I’m going to save mine as “guj_formulas.xla”.

Now we just need to register our file as an add-in. This will tell Excel to open this file each time Excel is started. So to follow through with the previous metaphor, every time Excel gets ready to work, it grabs its trusty spiral notebook with all of your formulas in it.

To do this, go to Tools -> Add-Ins…

Click “Browse”, and double click your file. It adds it to the list of Add-Ins, and goes ahead and checks it for you.
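
The same registration can, I believe, be done from VBA through the Application.AddIns collection. The sketch below is an assumption-laden illustration: the file path is hypothetical and needs to point at wherever you actually saved your .xla, and in my understanding Excel wants at least one workbook open when this runs.

```vba
Sub RegisterMyAddIn()
    ' Hypothetical path: point this at wherever you saved your .xla file
    Dim addInPath As String
    addInPath = "C:\AddIns\guj_formulas.xla"

    Dim ai As AddIn
    Set ai = Application.AddIns.Add(Filename:=addInPath, CopyFile:=False)
    ai.Installed = True   ' same effect as ticking the box in the Add-Ins dialog
End Sub
```

Setting Installed back to False unticks the add-in again without deleting the file.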

And that’s it! Every time you open Excel you can use the “stupid_formula” formula. Alternatively, you can enter the VBA window at any time to view the formulas you have saved as an add-in, and alter or add to them. You will need to save your work using the save function on the VBA window in order to realize these changes in future instances of Excel.

Have fun with it, and remember, Excel works for you, not the other way around.

Create Your Own Excel Formulas

Having just completed a very large project at work involving six horrendous weeks of brutal data pulling, formatting, and analysis, and resulting in two 400+ page books and two 150+ page books of stuff that nobody will ever really care about, I am now free to write a new post! Today’s topic is creating custom formulas for Excel.

Excel is nothing if not configurable, but few people have the technical acumen, the desire, or the time to spend customizing it. Hopefully, this post will help to boost the technical acumen of the reader, but as for the desire and time, you will have to be the judge. I think you will find that performing some simple tasks such as this will free up much more time in the future, and that should feed your desire.

Ok, let’s get crackin’. The example I’m going to use throughout this tutorial is for a formula which we use frequently around the office, but of which few people are aware. It is for the Compounded Annual Growth Rate, or CAGR. When you have two figures occurring in different years, say, the population of a county in Florida, and you want to know how much this figure grew or will grow in each year in between, you use the CAGR.

Now I know what you’re thinking. The growth rate should be as simple as taking the quotient of the difference between the first and last numbers over the first number, i.e. (y - x)/x where x is the first number and y is the last, and then dividing the result by the number of years in between.

What that will give you is, in fact, a simple solution to this problem. If the population of Orange County is 100,000 in 2007 and is slated to be 200,000 in 2012, then it will grow (200,000-100,000)/100,000, or 100%. Divide this by the number of years, and you get 100%/5, or 20% per year. However, if the population truly grew 20% per year, then the growth would compound each year like interest, resulting in a final population of 248,832. This is the same way credit card companies make money. This is also the way anyone can make money just by having money.
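
If you’d like to see the compounding for yourself, a quick sketch like the one below will do it. The macro name CompoundCheck is made up for illustration; the figures are the ones from the paragraph above.

```vba
Sub CompoundCheck()
    Dim population As Double, yr As Integer
    population = 100000

    For yr = 1 To 5
        population = population * 1.2   ' grow 20% on the running total
    Next yr

    ' Prints 248,832 in the Immediate window (Ctrl + G in the VBA editor)
    Debug.Print Format(population, "#,##0")
End Sub
```

The running totals are 120,000, then 144,000, 172,800, 207,360, and finally 248,832, well past the 200,000 we were aiming for.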

Anyway, what we want to get at is the percentage by which the population would have to grow each year in order to reach 100% total growth in five years. This is calculated using the CAGR. The formula looks like this:

CAGR(t0,tn) = (V(tn) / V(t0))^(1 / (tn - t0)) - 1

Where V(t0) is the start value, V(tn) is the finish value, and (tn - t0) is the number of years in between.

Written as a function, accepting start_value, finish_value, and number_years, it would look like this:

CAGR = ((finish_value / start_value)^(1 / number_years)) - 1

Sort of a mouthful, and quite a bit to remember. Even when memorized, it’s quite a bit to have to type every time you want to use the formula. It would be nice of Excel to include this as a standard formula, but it doesn’t, and that’s probably because only a small number of users would ever need it. There are probably thousands of formulas that some people use every day that I will never even need to know. That’s why Excel is customizable. So, on to the point of all of this.

Wouldn’t it be nice if instead of writing that formula over and over again, one could just type

=CAGR(A1,B1,5)

into a cell and get the yearly growth rate? Well, you can! And you do it using Excel VBA, which stands for Visual Basic for Applications. Visual Basic is a programming language, but as the name implies, it is very simple and intended for the average or slightly above-average user. Once you understand some of the basic syntax, you can do just about anything.

You get to the VBA section of Excel by clicking on Tools -> Macro -> Visual Basic Editor. Once there, you need to create a module, or a file containing programming code, in your workbook. Do this by clicking Insert -> Module. You now have a blank canvas in front of you, and you are ready to create magic.

For our CAGR formula, type the following into this blank canvas:

Function CAGR(start_value As Double, end_value As Double, num_years As Integer) As Double
   CAGR = ((end_value / start_value) ^ (1 / num_years)) - 1
End Function

What does this mean? Well, the first line tells VBA what the name of your function is, and what parameters, or variables, to expect. We know that in our calculation of CAGR, we need to know both values and the number of years that separate them. So we tell VBA that we will be using these values. The first two are accepted as double, which means they are stored as double-precision floating-point values, which is a fancy way of saying the number can be really big, or have a lot of digits after the decimal point. It’s one of the roomiest number types VBA offers, and it’s probably way more than most people will ever need. The number of years is registered as an integer, which means it can be any whole number from -32,768 to 32,767 (a fractional value will be rounded). This should be more than sufficient for any CAGR calculations we will do.

The second line uses the name of the function (CAGR) and defines its value based on the parameters. Since it is the only line within our function, the last line (“End Function”) tells VBA to stop working, and just return the value of CAGR to the cell in which the function was entered.

Just close your VBA window to get back into Excel and start to have fun! Create a spreadsheet like this:

Then type in the formula to calculate the five-year growth rate (btw, this could be stored as a function in VBA as well, but you will have to name it something other than “growth” because Excel has a built-in function with that name):

Then type in our CAGR function in the adjacent cell:

And hit enter. Ta-da! We have our calculation!

Turns out it would actually grow 14.9% per year, and after five years, it will have grown 100%.
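
As a sanity check, the sketch below recomputes the rate from the Orange County figures and then compounds it for five years. CagrCheck is a hypothetical name, and the arithmetic is written out inline so the macro runs even without the CAGR function present.

```vba
Sub CagrCheck()
    Dim rate As Double
    rate = (200000 / 100000) ^ (1 / 5) - 1

    ' Prints 14.9% in the Immediate window
    Debug.Print Format(rate, "0.0%")

    ' Compounding that rate for five years gets us back to 200,000
    Debug.Print Format(100000 * (1 + rate) ^ 5, "#,##0")
End Sub
```

The unrounded rate is roughly 14.87% per year, which is why five years of it lands on the doubled population rather than overshooting the way 20% did.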

That’s it for now. The thing to remember about these saved formulas is that they’re only available in the workbook in which they are saved (or in other workbooks so long as that workbook is open), so you’ll have to recreate the VBA code in each new workbook where you want to use it (however, once it is calculated, the value will remain in the cell whether or not the function is present). In the next post, I’ll show you how to save your VBA code as an Excel add-in, so it will be available every time you open Excel.