Building a database of line scores from Retrosheet data (which is simple) but I also intend to include fields for net score after each inning (Hscore - Vscore). From that I want to look at when teams actually win or lose games. Poking around the web led to a MATCH-INDEX formula that works. Except...I realized that a team could hold the lead, lose it, and then regain it, so the net score could go like 0,+2,-1,+1,-2,+5...., and the formula would return +2 when the answer should be +5. Any ideas? I'm currently in a forest/trees state.

Is there no other way to derive the win? edit: removed previous post. How is net breaking down? Score tied, home team winning by two, home team losing by 1, home team winning by one, home team losing by 2, home team winning by 5?

It would be helpful to see an example of what you're talking about. Post an example to Google Sheets?

i'm not following. I assume you're creating fields like ScoreAfter1, ScoreAfter2 ... 9, by doing the sum of the inning scores for home team (up through that inning) minus the inning scores of the visiting team. There's no reason that should only return the final score of the game. What's going wrong?

I think he's looking at the point where the lead is gained and never lost. That means you need to work backwards in your data and look at the first time the lead goes to zero or negative runs. Then add an inning.

Yes, I'm looking for the inning in which the winning team has scored enough runs to win the game, regardless of whether or not it scores more runs. I thought about doing it backwards but then I got preoccupied with extra-inning games and ones that were shorter than nine. I'll have to think about that some more. [or lose games] Retrosheet's line scores are complete back through the 1906 season and I think it would be interesting to see if/how scoring has changed through different eras. Thanks.

Dedicate a column to inning numbers and just return that value rather than the count. The total number of innings doesn't matter.

Not sure I understand. I have a table with individual innings for the visiting and home teams and then another table with the difference in scores after each inning. Vinn1/Vinn2...; Hinn1/Hinn2... and AFT1/AFT2.... One of the AFT columns in a row will have the number I want but I haven't figured out how to get Excel to recognize that for each game. If I could dedicate a column and fill it in, I wouldn't have my problem . Maybe I'll try flow charting it.

I can think of two ways to do this. I think index-match-index would work or an array index match. See here: https://exceljet.net/formula/index-and-match-with-multiple-criteria The other way is to create a text string of the net score after each inning using text values in reverse order. So a 5 inning game that went 1,-2,0,-1,+2 would be coded as +1,-2, +-0 , -1, +2 would concatenate to "+2-1+-0-2+1" in a given cell. For this to work, every cell must be in text format. In my spreadsheet this reverse order value is in cell G1. Then use this formula on the concatenated value: =IF(ISERROR(ROUNDDOWN(IF(LEFT(G1,1)="-",(SEARCH("+",G1)-1)/2,(SEARCH("-",G1)-1)/2),0))=TRUE,"Firstinning",ROUNDDOWN(IF(LEFT(G1,1)="-",(SEARCH("+",G1)-1)/2,(SEARCH("-",G1)-1)/2),0)) The essence of the formula is this: =IF(LEFT(G1,1)="-",(SEARCH("+",G1)-1)/2,(SEARCH("-",G1)-1)/2) Where if the first character of the string (which is the last inning) is a "-" then it searches the string for the first "+" and vice versa searching for "-" where the last inning is a "+". Since the concatenated string is the game by inning in reverse order, this searches backwards through the game for the last time the losing team led the game. Subtracting 1 and dividing by 2 will get you how many innings back from the end of the game the final lead change occurred. Should be simple from there to translate that to an inning based on how long the game was. I've used rounddown to get rid of half innings caused by the zero having both signs and iserror to get rid of problems where a team takes the lead in the first inning and never gives it up. In that case it returns "firstinning". I'm not sure if I totally understand your problem, but I think the reverse concatenate here is the key to getting the "+5" instead of the "+2" in the example from your first post.

If I'm understanding you correctly, you want to find the last inning (= last position in the column) where the net score is <= 0, and then add 1, to get to the first inning where the lead stays positive. I did an internet search and found https://www.extendoffice.com/documents/excel/2681-excel-find-the-first-last-positive-number.html, for example.

Let me try explaining what I want to do one more time. Suppose the visiting team does not score in the top of the 1st but the home team scores 4 runs in the bottom of the 1st. Now, if the visiting team's final score is 3 runs, then effectively the home team won the game in the bottom of the 1st. The problem I am having with this is when the lead changes hands during a game. If a game goes into the bottom of the 9th or into extra-innings, you know when the winning run was scored and you also know in shut-outs; it's the other games. MATCH would do what I want except the array searched has to be sorted in ascending or ascending order.

I know excel, I love excel, but people over-obsess about trying to make Excel do things that are algorithmically complex enough that it's no longer the right tool for the job. This sort of thing would be like 20-30 lines of code in PHP or Python or whatever your scripting language of choice would be. (or 200 lines of buggy VB, but leave that aside) If you can represent this logic as a bunch of for() loops and if()-else() statements, you should just code it as such.

Can you share a small extract? I think we could probably do it with some nested formulas. Like, if the losing team scores 4 runs, you want to identify the first inning that the winning team had accumulated 5 runs. It should be do-able.

So out of curiosity I was playing with this quickly, and it's really getting the "when did they win?" to populate, which I'm trying to figure out. I mean, obviously they won in the 7th when that tie turned into an "X", but I'm just trying to figure out the best way to do a nested IF playing off if they got the W or not, and have it populate the inning that the game was won. edit - basically need something that looks for, and returns the last instance of "Tie" turning into "X" and that should always be the inning that the game was won, even if it was the first. In my example it's the 7th.

I see what you're trying to do and I think you're going about it in a convoluted way. If a team "wins" the game by scoring 4 runs at any point in the game when the other team scores 3, then you need to take the losing team's final run total and find the inning in which the winning team exceeded it. Lead changes are irrelevant in your definition.

I've been hoping to stay away from nested IFs. I think you need to look at the point where the losing team has scored its final runs then find the inning when the winning team exceeds that total. I've got something I'm going to try but I have to think about it some more.

Seriously. Come ask for help. Multiple people want to help. Multiple people ask for data. Multiple responses from OP w/ no data.

Does a team effectively win when they take the lead for good, or when they surpass the other teamâ€™s final run total? The link I posted gives at least one way to find the last place in a column where a condition is met, whether thatâ€™s the last time before team A took the lead for good, or the last time before team Aâ€™s run total became greater than team Bâ€™s final run total.

My idea of what you wanted was right. The above formula I posted works for what you have described here. You would just have to code with pluses and minuses which might be a pain. Itâ€™s convoluted to do this in excel imho and maybe a different tool would work better.

I may have it figured out but I have to run a bunch more data through it. MATCH returns the location in a data array but it wants the data in ascending order and then returns the first, not last. It's going to be a pain because I'll have something like a million records to process. I'll post some data and the results when I get done.

We're asking for a small subset of the records so we can look at it and try to offer a solution, but I'm also interested in seeing the result, sure.

Wait, you're using Excel for a million rows of data? InstaFace has it right; you're using the wrong tool for this problem. You must not be working with the entire data set, otherwise I would have expected a crash by now.

No, I'm downloading comma-delimited files by decade (well, the years are zipped by decade) into Excel to make tables for an Access database.

You can't reverse your array, simply by having a row below with A1 = A9, A2 = A8, etc., and then run the MATCH on that?

The syntax of MATCH is MATCH(lookup_value, lookup_array, [match_type]), where match_type = 1 --- finds the largest value that is less than or equal to the lookup_value 0 --- finds first value identical to lookup_value -1-- finds the smallest value that is greater than or equal to the lookup_value Finding a match_value equal to the lookup_value is a problem as it may find an inning in which the score is tied or the first inning in which identical scores were derived. This is some data:

So I can do some dumb nest If statements! I copied your A1 through R11. In Column T through X I titled them Vscore, Hscore, Needed to win & Winner, respectively. Vscore & Hscore are obvious, Needed to win is: =MIN(T2:U2)+1 Winner is: =IF(T2>U2,"V","H") Win Inning is the stupid one: =IF(W2="H",IF(AH2>=V2,1,IF(AI2>V2,2,IF(AJ2>V2,3,IF(AK2>V2,4,IF(AL2>V2,5,IF(AM2>V2,6,IF(AN2>V2,7,IF(AO2>V2,8,9)))))))),IF(Y2>=V2,1,IF(Z2>V2,2,IF(AA2>V2,3,IF(AB2>V2,4,IF(AC2>V2,5,IF(AD2>V2,6,IF(AE2>V2,7,IF(AF2>V2,8,9))))))))) Where columns Y through AG are each the visitor's total runs after each inning 1 through 9 and AI through AP are the home team's total runs after each inning 1 through 9 This only works for 9 inning games, but you said the extra inning part would be easy to append to this. For rows 2 through 11, I came up with the home team winning in innings: 1 9 7 8 7 2 9 6 8 9

No, you need to look at lines 14-23, which show the cumulative scoring by inning, e.g., SUM($A14:A14), ..., SUM($A14:I14)...; SUM($J14:J14),...SUM($J14:R14). Then you find the first inning in which the Home Score in a line is greater than the first highest score in the Visitors line. I filled in those in the Home Score line in red (and they are listed numerically in column T opposite the first tables).

Im pretty sure my formulas did that without creating extra rows, but making extra columns instead. Edit: I mean, did you get different results than I did?

He has his results above and yes he got different results. 4th game for example is not won in the 8th as your set says, it's won in the. The 5th game is one in the 3rd, not 7th. I didn't check the rest.

In Row 1, the next-to-last column (Col. T) on the right is Innwon, and that column contains the inning in which each game was won. I originally had some accompanying explanations in the Google Sheet but they caused columns to stretch out to very wide lengths, so I removed them and forgot to add them in my written explanation. Sorry.

I corrected my "Win Inning" fomula below. The problem was I used ">" instead of ">=" several times. I now get all the same results as you: =IF(W2="H",IF(AH2>=V2,1,IF(AI2>=V2,2,IF(AJ2>=V2,3,IF(AK2>=V2,4,IF(AL2>=V2,5,IF(AM2>=V2,6,IF(AN2>=V2,7,IF(AO2>=V2,8,9)))))))),IF(Y2>=V2,1,IF(Z2>=V2,2,IF(AA2>=V2,3,IF(AB2>=V2,4,IF(AC2>=V2,5,IF(AD2>=V2,6,IF(AE2>=V2,7,IF(AF2>=V2,8,9)))))))))

As I remarked earlier, I was hoping to avoid nested IF statements. I thought there might be some Excel "trick" that I was overlooking. But, thanks.

Yeah, once upon a time. I was quite good at programming...Fortran with assembly language subroutines, but that was a long time ago and I haven't done any programming since MS-Dos days when Basic was included, so it might be a little more for me than just scribbling out 20-30 lines of code. Admittedly, most people who use Retrosheet data for analysis do use code. And just as an aside to all this, each record in a Retrosheet's Game Log has 161 fields, of which only about 10 are needed for this study, and the main two--visitor and home inning-by-inning scoring--are presented as words--102300[12]00, where [12] would be a 12-run inning by one team. So, your script would either have to separate each of those words into columns by innings (variable in game length) and you can do what I do and open the comma-delimited file in Excel, deleted the unneeded fields, copy the scoring word fields to my text editor and run a Regular Expressions/Perl routine to convert then to comma-delimited, and re-copy them back to Excel, or if you are really good, then maybe you can do all of it by coding.

that sounds like it would be 3-5x simpler by coding, in terms of total time spent getting the output you want. You can then format it using Excel if you like but at that point it's trivial. If you used to code, you've crossed the hardest barrier which is learning to think like a coder. If this is something you're spending your free time on, your time will be well rewarded by chugging through some basic Python documentation. I mean, if I have a string input like "102300[12]00", parsing that into an inningScore[x] array (in PHP) or List/Dict (Python) would seriously be like 10 lines of code max. Remember, a String is a sequence, every character is an entry in that sequence. Since I can assume that there are no 3+ digit inning scores, I can take the shortcut that anytime we hit brackets, the next two characters are the inning score (but making it generalized wouldn't be much harder). Code: i = 0 inningScore = [] # NB - zero-based array, will have indices 0-8 while i < len(gameString): if gameString[i] == '[': inningScore.append( int(gameString[i+1] + gameString[i+2]) ) i += 3 # move past the close bracket else: inningScore.append( int(gameString) ) i += 1 This really is beginner-level stuff for real coding. You're frankly doing a lot more mental work trying to shoehorn this into Excel than you would by just biting the bullet and changing toolkits, even if it didn't feel that way at the start.

As I noted, I already have a mechanism for parsing the innings score words. I simply replace all inning scores like [12] with a letter unique to the score (102300[12]00 becomes 102300c00, insert a semi-colon after each letter/number, then replace the letters with their corresponding numbers. I use NoteTab Pro as a text editor and it has its own programming language built in, largely based on Regular Expressions. It also can act as a clipboard, so I can turn that on, select a column in Excel and hit Ctrl+C. Now when I switch to NoteTab, the parser I wrote will process several thosand lines in a second or two and I can just paste it back into Excel. As for using Excel, I also noted that each record from a Retrosheet Game Log will have 161 fields in it, of which I have to delete about 150. The data I get is comma-delimited (.csv) and the easiest way I know how to deal with that is simply opening it up in Excel. That also makes it easy to decline importing unwanted fields. Once all that was done, I figured there must be some reasonable way to manipulate the data (inning-by-inning scores) in Excel to get what I need. Unfortunately, I haven't succeeded in that part. This is a link to the Retrosheet page with Game Log data (and that also includes a link to a Guide to Retrosheet Game Logs). Data can be downlaoded by year, decades, or en masse. https://www.retrosheet.org/gamelogs/index.html Thanks

Why avoid the nested IF, if your desired platform is excel? Write the formula once and be done. Also, using excel to get rid of the fields your don't want becomes tiresome. You could easily use many scripting languages to read in the entire file, it out the desired fields in a new file. Anything SQL based would be a breeze once set up. Push a button and your file is transformed. Could nest the if() in there too.

I was avoiding nested IF statements because I thought (was hoping) there was a saner way of accomplishing what I wanted in Excel. My knowledge of SQL is very limited and I would need to spend time becoming proficient in that as well as whatever scripting language worked with it, Windows 8.1, and MS Access (or learn a new free database system). Since the only solution this group has come up with for Excel is nested IFs, then I guess I'll be following that route. But, thanks for the idea.

I used to be an extremely proficient SAS (Statistical Analysis System) coder - it's like SQL in some ways, like FORTRAN in others, with built-in scientific and statistical functions. Very easy to learn, but.... it's not free. Or cheap, if you're using it commercially --- www.sas.com

While it might be good, this is a one-time non-commercial project and I guess it will be Excel nested IFs. Thanks