Showing posts with label MID. Show all posts
Showing posts with label MID. Show all posts

Extracting numbers from the middle of a mixed string (text and numbers)

We often asked to make sense of imported data. Perhaps you are asked to extract information from within a string of data. Now in another post I showed how to extract numbers from the front of a string.
However, in this example I was asked to extract the middle numbers (starting with a 1 and going 5 digits). Well I broke out my mad crazy Excel skills and after an hour I came up with a solution utilizing 2 array functions, a Left, Mid, Find and Right function. I was smugly pleased with my result but it was not eloquent even though it solved my problem.
Line Description
85363D ATLANTA    18545  785050    369634011-10-05
85364D FLORIDA    11430  785050    369662011-10-05
85365D NY         11928  785050    363672011-10-05
85366D COLUS      13535  785050    810142011-10-05
85367D AMSTERDAM  14089  785050    810152011-10-05
85368D NEVA       14331  785050    369712011-10-05
85340D GEORGIA    14331  785050    369702011-10-05
If I had taken a moment to change the font to a true type (such as courier), I would have seen the following.
Line Description
85363D ATLANTA    18545  785050    369634011-10-05
85364D FLORIDA    11430  785050    369662011-10-05
85365D NEW YORK   11928  785050    363672011-10-05
85366D COLUMBUS   13535  785050    810142011-10-05
85367D AMSTERDAM  14089  785050    810152011-10-05
85368D NEVADA     14331  785050    369712011-10-05
85340D GEORGIA    14331  785050    369702011-10-05
With a courier font, all the columns that at first appeared to not line up, in fact did line up. In addition, in  every row that I needed, there was a common denominator (785050) (The above data is just a sample of the rows I needed - they were not grouped together in the original data). To use an old saying, "I was lost in the tall grass and could not see the obvious". This one column data that I was looking at was in a grouping of about 20 other columns and the outside distraction of those other columns helped obfuscate the pattern.
Well with this clear data I could scrap all my fancy array formulas and simple extract the data with the following code.
=IFERROR(MID(A2,(FIND(785050,A2,1))-7,5),"Not found")
I added the IFERROR because there were some records that did not have 785050.
So what does this formula do?
Well…
FIND(785050,A2,1)  - checks to see if the number 785050 is in my string that I am searching
MID(A2(  - uses the FIND to return the string I am looking for.
IFERROR – used to avoid #Error or N/A errors.
-7  - since I want to return the 5 digit number and I located the starting position of 785050, I needed to look to the left 7 characters to get my starting point.
Another way to get my answer would be to break out my data with Text To Columns. Since my data is aligned (and the record layout is apparent when I converted the text to courier), Text To Columns is another great way to break out my data without coding.
So what's the lesson to learn here? First always examine your data before you begin. You want to always compare apples to apples so convert the data to a true type font such as courier to help you identify patterns. 



















The other day I was asked if I could break out numbers from a cell that contained both text and numbers. My manager wanted the numbers in one column and the text to the right in the adjacent column.

I thought sure, easy to do with a combination of =Left =Mid and = Right functions. However, when I looked at the data I noticed that the numbers did not start in the same position in the text. I also noticed that some cells had 6 numbers while others had 4. To make things even more difficult, some cells started with numbers first.

Well I thought about the problem and decided to solve it by converting the formula into an array.

What’s an array formula?

An array formula is a formula that works with array data values rather than a single data value. 

There are two different types of array formulas
1.     Formulas that work with an array or series of data and aggregate the result. Sum, Average and Count are good examples of this type as they all return a single value to a single cell.

2.     Formulas that returns an array of numbers as its result. These array formulas are entered into multiple cells that are then treated as a group. For example, =ROW(A1:E20).

In the above example I have my original data in column A. I first use the formula in column B to extract the numbers in the string. After entering the formula, I hold down the control key, shift key and enter key all at the same time to convert the formula into an array.

Once I had that, it was a simple matter of getting the length of how many numbers were in the original data and the starting position of the numbers in the string (Columns C and D).

Based on that, I could then parse out the text to the right of the number (Column E).

Now to accomplish all this I needed to use 7 functions….

=IF
=MID
=MATCH
=ROW
=LEN
=COUNT
=ISNUMBER

All of these functions are simple to use but the trick was combining them to get my solution.

Indeed knowing what functions are available in Excel will allow you to join them to solve almost any problem you have to solve. So it’s a good idea to learn them or at least know they exist so that you can reference them.

{=1*MID(A2,MATCH(TRUE,ISNUMBER(1*MID(A2,ROW($1:$100),1)),0),COUNT(1*MID(A2,ROW($1:$100),1)))}

{=MATCH(TRUE,ISNUMBER(1*MID(A2,ROW($2:$9),1)),0)}

{=IF(MATCH(TRUE,ISNUMBER(1*MID(A2,ROW($2:$9),1)),0) = 1,MID(A2,(E2+D2),90),MID(A2,(E2+D2+1),90))}


How would you have solved the above problem?

Separating First and Last Names

One of the more common tasks you may come across is breaking out data from inside a cell. I am often asked to separate first and last names that are joined together in one cell. In this example I show a simple way to separate first and last names. I use a series of formulas to accomplish this task, FIND, MID, LEFT, LEN, TRIM and ISERROR.











Last name formula used in Column B
=LEFT(A2,IF(ISERROR(FIND(",",A2,1)),LEN(A2),FIND(",",A2,1)-1))

First name formula used in Column C

=TRIM(IF(ISERROR(FIND(",",A2,1)),A2,MID(A2,FIND(",",A2,1)+1, IF(ISERROR(FIND(" ",A2,FIND(",",A2,1)+2)),LEN(A2), FIND(" ",A2,FIND(",",A2,1)+2))-FIND(",",A2,1))))

How do you accomplish this task?