How to get the cartesian product in Excel
Imagine you have two lists of data in Excel, perhaps a list of products and a list of colors, and you need to generate all possible combinations of each product with each color. This is precisely what the Cartesian product achieves. It’s a fundamental concept in mathematics and can be incredibly useful in Excel for tasks like creating test cases, generating all possible scenarios, or combining different sets of options.
Let’s say you have your list of products in column A, starting from cell A1, and your list of colors in column B, also starting from cell B1. To obtain the Cartesian product, we will leverage a combination of Excel functions. First, you need to determine the total number of rows your resulting Cartesian product will have. This is simply the number of products multiplied by the number of colors.
Now, in a new column, say column C, starting from cell C1, we will enter a formula to extract the correct product for each row of the Cartesian product. The formula we will use is INDEX(A:A, 1 + MOD(ROW() – 1, COUNTA(A:A))). Let’s break this down. COUNTA(A:A) counts the number of non-empty cells in column A, giving us the total number of products. ROW() returns the current row number. ROW() – 1 adjusts the row number to start from zero for the first row of our result. MOD(ROW() – 1, COUNTA(A:A)) calculates the remainder when the adjusted row number is divided by the number of products. This ensures that the product list cycles through for each color. Adding 1 to this result gives us the correct index to use with the INDEX function. INDEX(A:A, …) then returns the product from column A at the calculated index.
Next, in the column adjacent to the product column, say column D, starting from cell D1, we will enter a formula to extract the corresponding color for each combination. The formula for this is INDEX(B:B, 1 + INT((ROW() – 1) / COUNTA(A:A))). Again, let’s dissect this. COUNTA(A:A) gives us the number of products. ROW() – 1 adjusts the row number. (ROW() – 1) / COUNTA(A:A) divides the adjusted row number by the number of products. INT(…) takes the integer part of this division. This ensures that each color is repeated for the entire set of products. Adding 1 to this result gives us the correct index for the INDEX function. Finally, INDEX(B:B, …) returns the color from column B at the calculated index.
After entering these formulas in the first row (C1 and D1), you need to drag these formulas down to as many rows as the total number of combinations you calculated earlier (number of products multiplied by the number of colors). Once you have dragged the formulas down, columns C and D will contain the complete Cartesian product of your product list and your color list. Each row will represent a unique combination of a product and a color. You can then copy the values in columns C and D and paste them as values in a new location if you no longer need the formulas. This method provides a dynamic way to generate all possible pairings between two sets of data in Excel without resorting to complex programming or external tools.
Leave a Reply