How to slice data in Excel without VBA using the INDEX function
Slicing a two-dimensional data range is now possible in Excel thanks to Dynamic Arrays introduced in 2018. Throughout this series I will demonstrate how to use built-in functions such as INDEX, TAKE, OFFSET, CHOOSEROWS and CHOOSECOLS to slice data like in Python, MATLAB, etc. without writing a single line of VBA code.
This tutorial focuses on how to use the build-in Excel function INDEX to slice data in Excel. Then we will use the LAMBDA function to automate the slicing functionality to achieve a similar look and feel to Python, etc.
If you would rather prefer to skip all the explanations and jump straight to the formula, then please go directly to the summary! 🚀
Table of Contents
· What does it mean to “slice” data?
· Part I — Getting familiar with the INDEX function
∘ Example — Return data from row 2 and column 4
∘ Example — Return the entire row 2
∘ Example — Return the entire column 4
∘ Example — Return the whole source array
∘ Example — Return rows 1 to 3, columns 4 to 6
∘ Example — Return the centre two rows and columns
∘ Example — Return the first 3 rows
∘ Example — Return columns 2 to 5
∘ Using the INDEX and SEQUENCE functions in combination
∘ Example using SEQUENCE — Return the first 3 rows
∘ Example using SEQUENCE — Return the columns 2 to 5
∘ Example using SEQUENCE— Return rows 1 to 3, columns 4 to 6
· Part II — Automating the slice function
∘ Basic functionality replication
∘ Advanced functionality replication
∘ Storing the functionality as a LAMBDA function
· Summary and the revised formulas
∘ Further improvements
· Bonus — Return slice as a flat array
· Downloads
What does it mean to “slice” data?
To “slice” data simply means to select a specific range or subset of data to be returned back to the user for a particular data analysis purpose.
The idea of slicing can be easily shown in GNU Octave Online at https://octave-online.net/ without having to install any software. Write the following code in the Octave Command Prompt to generate a 6-by-6 matrix containing integers from 1 to 36:
data = reshape(1:36, [6, 6]).'
To slice this matrix all you have to do is specify which row and column ranges you would like returned. For example, the following command would return rows 1 to 3 and columns 4 to 6:
data(1:3, 4:6)
If you are more comfortable with Python, then try the following code in your favourite IDE or try it online at https://jupyter.org/try in a new Jupyter Notebook:
import numpy as np
data = np.reshape(np.arange(1, 37), (6, 6))
print(data)
Note that indexing in Python starts at 0 and the last index is not included. To call rows 1 to 3, one would have to call index 0 to 3 where row index 3 is excluded. To return the same slice of data as above, run the command:
data[:3, 3:]
This slicing functionality is what we will try to replicate in Excel without writing a UDF in VBA. We will use built-in functions only and a little bit of creativity.
Part I — Getting familiar with the INDEX function
The INDEX function is a very versatile Excel function. Its original use was to simply return a value from a range for a certain location. Combining the INDEX function with the MATCH function and multiple criteria as in this EXCELJET tutorial, we get a very powerful and flexible lookup.
With the introduction of dynamic array functions in September 2018, the INDEX became even more powerful. It can now return dynamic arrays too. These kind of formulas “spill” automatically in the neighbouring blank cells which make the legacy Ctrl+Shift+Enter
(CSE) array formulas redundant.
The syntax for this function is deceptively simple, =INDEX(array, row_num, [column_num])
with the following inputs:
- array is the required range of data. It important to note that if the array is a column or row of data the INDEX function will return a single value. If the array has more than one row and more than one column then it returns an array. Keep this in mind as we will make use of it.
- row_num is required unless column_num is present.
- column_num is optional, but it becomes mandatory if row_num is omitted.
In brief, the INDEX returns the data from array that lies at the intersection of row_num and column_num.
Given the following data in an Excel spreadsheet:
Example — Return data from row 2 and column 4
Type the following in another cell =INDEX(A1:F6, 2, 4)
. This returns the value 10 which is at the intersection of the 2nd row and the 4th column.
Example — Return the entire row 2
By setting the row_num
or column_num
to 0, the INDEX function will return the entire row or column respectively. Type the following in another cell =INDEX(A1:F6, 2, 0)
The simplest way to explain setting column_num
to 0 is that the formula is equivalent to pass the set of every single column in the source data range =INDEX(A1:F6, 2, {1, 2, 3, 4, 5, 6})
. Try it!
Example — Return the entire column 4
Like in the example above, we set the row_num
to 0 as in =INDEX(A1:F6, 0, 4)
to return:
The 0 in the row_num
is equivalent to providing the set of every column index as in =INDEX(A1:F6, {1, 2, 3, 4, 5, 6}, 4)
Example — Return the whole source array
If we would like to return the whole original array, then it couldn’t be simpler than =INDEX(A1:F6, 0, 0)
, and it works. Inferring from the explanations above using sets of indices, one would think that the following =INDEX(A1:F6, {1, 2, 3, 4, 5, 6}, {1, 2, 3, 4, 5, 6})
would be equivalent to using the 0 for both row_num
and column_num
. It is not.
Interestingly, this returns the diagonal of the 2D array i.e., the intersection of each row and column index in order. The diagonal of a matrix is an important concept in linear algebra, but I digress.
To achieve the equivalent formula to =INDEX(A1:F6, 0, 0)
, we need to make use of the TRASPOSE function as follows =INDEX(A1:F6, TRANSPOSE({1, 2, 3, 4, 5, 6}), {1, 2, 3, 4, 5, 6})
, which should return the entire source array. Put simply, the TRANSPOSE function converts a row array into a column array and vice versa. Another nod to linear algebra.
Example — Return rows 1 to 3, columns 4 to 6
The set notation becomes particularly handy here. Try the following formula =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), {4, 5, 6})
to return:
Example — Return the centre two rows and columns
Enter the following formula =INDEX(A1:F6, TRANSPOSE({3, 4}), {3, 4})
in another cell to return rows 3 to 4 and columns 3 to 4:
Example — Return the first 3 rows
Unfortunately, mixing the set notation with the 0 notation for row_num
or column_num
does not work as expected if we want to return entire rows and columns. If we try the formula =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), 0)
we do get the elements from the first 3 rows but only from column 1. It is the same as combining apples with pears.
The solution is to use set notation consistently, such as =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), {1, 2, 3, 4, 5, 6})
.
Example — Return columns 2 to 5
Similarly, we can try =INDEX(A1:F6, 0, {2, 3, 4, 5})
. This returns the elements from columns 2 to 5 but from the first row only.
As in the previous example, we need to use the set notation consistently as in =INDEX(A1:F6, TRANSPOSE({1, 2, 3, 4, 5, 6}), {2, 3, 4, 5})
.
Using the INDEX and SEQUENCE functions in combination
The set notation is great. However, to make our “slicing” functionality dynamic we need a way to generate the sets of indices. Thus, we introduce the SEQUENCE function which will aid us in automating the capability to slice data in a more “pythonesque” style.
Note: The SEQUENCE function is available only for Excel 2021, Excel for Microsoft 365 and Excel for the web subscribers.
The syntax for the SEQUENCE function is =SEQUENCE(rows, [columns], [start], [step])
where the inputs are:
- rows is the number of rows to generate. If only this input is passed as an argument then it will generate a column of integers 1 up to the number of rows input, for example
=SEQUENCE(10)
will return numbers 1 to 10. - [columns] is the optional number of columns to generate. The default is value 1. If we input the number of rows and columns, then it will output a 2D array a.k.a. “matrix” containing the numbers 1 up to
rows * columns
. To return the example data from 1 to 36 as given above in the range A1:F6, this can easily be generated as=SEQUENCE(6, 6)
. - [start] changes the starting number from the default of 1. For example, if you would like to start your sequence at the number 10, then simply change the third input argument to 10.
- [step] specifies the step size other than the default 1. For example, if you would like to generate the first ten odd numbers, then run the formula
=SEQUENCE(10, , , 2)
or if you are not comfortable with passing blank arguments you can run=SEQUENCE(10, 1, 1, 2)
.
Example using SEQUENCE — Return the first 3 rows
Combining the INDEX function with the SEQUENCE to achieve the same output as =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), {1, 2, 3, 4, 5, 6})
we simply generate the required sets of indices as follows =INDEX(A1:F6, SEQUENCE(3), SEQUENCE(, 6))
.
The first formula SEQUENCE(3)
returns the column vector TRANSPOSE({1, 2, 3})
. Note that we only need the rows input. However, the usage above would be equivalent to SEQUENCE(3, 1, 1, 1)
.
Likewise, the usage of SEQUENCE(, 6)
would be equivalent to SEQUENCE(1, 6, 1, 1)
. The default values are 1 as mentioned in the syntax.
Example using SEQUENCE — Return the columns 2 to 5
To achieve the same out but as =INDEX(A1:F6, TRANSPOSE({1, 2, 3, 4, 5, 6}), {2, 3, 4, 5})
we could replace the sets with the SEQUENCE function as in =INDEX(A1:F6, SEQUENCE(6), SEQUENCE(, 4, 2))
. Try it out!
The new formula is equivalent to providing the default 1 as in =INDEX(A1:F6, SEQUENCE(6, 1, 1, 1), SEQUENCE(1, 4, 2, 1))
.
Example using SEQUENCE— Return rows 1 to 3, columns 4 to 6
Last but not least, to achieve the same output as the formula =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), {4, 5, 6})
we replace the sets with the following =INDEX(A1:F6, SEQUENCE(3), SEQUENCE(, 3, 4))
.
For completeness, the formula is equivalent to =INDEX(A1:F6, SEQUENCE(3, 1, 1, 1), SEQUENCE(1, 3, 4, 1))
. We will have to be thorough as we have with the complete formulas in order to automate.
Part II — Automating the slice function
Automating the slice functionality can be achieved similarly to the functionality in Python, MATLAB, etc. with a bit of thought and creativity. That is, we would like to construct a named function such as =SLICE(source_data, row_start, row_end, column_start, column_end)
as not to repeat complicated nesting functions in Excel, and perhaps add some validation logic.
We could add a step size as well since the SEQUENCE allowes for it. That is however beyond the purpose of this tutorial.
Warning: The logic will get more complicated from here on! This section is for the adventurous few who enjoy building complicated formulas.
Basic functionality replication
The basic concept has already been demonstrated in the previous three examples, with the SEQUENCE function specifying the indices as =INDEX(source_data, SEQUENCE(num_rows, 1, row_start, 1), SEQUENCE(1, num_cols, column_start, 1))
, where the source_data
is the range A1:F6 in our example:
Here we have the number of rows and number of columns in cells M2 and M4 calculated simply as:
num_rows = J2 - J1 + 1
num_cols = J4 - J3 + 1
The resulting data slice in the dynamic range P1# is the basic formula:
=INDEX(A1:F6, SEQUENCE(M2, 1, J1, 1), SEQUENCE(1, M4, J3, 1))
Note that the step sizes are not necessary in the nested formula, as they do default to 1. We leave them in for now, in case we want to automate the step size in the future.
Advanced functionality replication
Next, we can customize the logic to allow for blank input arguments and for negative arguments indicating the end of the array. This customization should bring our functionality closer to the “pythonic” way.
We first cater for blank input arguments. So if the row_start
is blank the column_start
is left blank then these will get set to 1. If the row_end
is blank then it is set to the number of rows in the source range, and if the column_end
is blank than that will be set to the number of columns in the source range.
Thus, for cell P1’s formula =INDEX(A1:F1, SEQUENCE(M2, 1, M1, 1), SEQUENCE(1, M4, M3, 1))
, we have the following intermediate calculations to be fed in:
where the formulas in column M have been adjusted for blank entries:
row_start
=IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1)
num_rows
=IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6), J2) - IF(OR(ISBLANK(J1),
TRIM(J1)=""), 1, J1) + 1
column_start
=IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3)
num_cols
=IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6), J4) - IF(OR(ISBLANK(J3),
TRIM(J3)=""), 1, J3) + 1
Next, we add some extra validation for the input arguments and we also check if the end arguments are negative like in Python. For example, if column_end = -1
then it means the user is asking for the last column. If column_end = -2
then it means the user is asking for the second last column, and so on. See the negative arguments in cells J2 and J4:
Column M’s formulas contain added validations for these negative end of the array arguments:
row_start
=IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1)
num_rows
=IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2 < 0, J2 >= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2 >= 1, J2 <= ROWS(A1:F6)), J2, NA())))
- IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1) + 1
column_start
=IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3)
num_cols
=IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4 < 0, J4 >= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6) + J4 + 1,
IF(AND(J4 >= 1, J4 <= COLUMNS(A1:F6)), J4, NA())))
- IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3) + 1
Lastly, we will insert all these intermediate formulas into the general formula =INDEX(A1:F1, SEQUENCE(M2, 1, M1, 1), SEQUENCE(1, M4, M3, 1))
to enable us to automate the slicing functionality with the aid of the LAMBDA function.
Storing the functionality as a LAMBDA function
Note: The LAMBA function is only available for Excel for Microsoft 365 and Excel for the web subscribers.
We will package the general formula in a LAMBDA function for further use in the current spreadsheet or to export it to another Excel file. Quite simply, we substitute the intermediate formulas from references M1 to M4 into the general formula =INDEX(A1:F1, SEQUENCE(M2, 1, M1, 1), SEQUENCE(1, M4, M3, 1))
to obtain this massive formula below.
=INDEX(A1:F6, SEQUENCE(IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2 < 0, J2 >= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2 >= 1, J2 <= ROWS(A1:F6)), J2, NA())))
- IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1) + 1, 1,
IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1), 1), SEQUENCE(1,
IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4 < 0, J4 >= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6) + J4 + 1,
IF(AND(J4 >= 1, J4 <= COLUMNS(A1:F6)), J4, NA())))
- IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3) + 1,
IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3), 1))
Using an editor such as Notepad++ we substitute the cell refences with the parameter names specified at the beginning of Part II, insert the massive formula into the LAMBDA() function to obtain:
=LAMBDA(source_data, row_start, row_end, column_start, column_end,
INDEX(source_data, SEQUENCE(IF(OR(ISBLANK(row_end), TRIM(row_end)=""),
ROWS(source_data), IF(AND(row_end < 0, row_end >= -1*ROWS(source_data)),
ROWS(source_data) + row_end + 1, IF(AND(row_end >= 1,
row_end <= ROWS(source_data)), row_end, NA())))
- IF(OR(ISBLANK(row_start), TRIM(row_start)=""), 1, row_start) + 1, 1,
IF(OR(ISBLANK(row_start), TRIM(row_start)=""), 1, row_start), 1),
SEQUENCE(1, IF(OR(ISBLANK(column_end), TRIM(column_end)=""),
COLUMNS(source_data),
IF(AND(column_end < 0, column_end >= -1*COLUMNS(source_data)),
COLUMNS(source_data) + column_end + 1,
IF(AND(column_end >= 1, column_end <= COLUMNS(source_data)),
column_end, NA())))
- IF(OR(ISBLANK(column_start), TRIM(column_start)=""), 1, column_start) + 1,
IF(OR(ISBLANK(column_start), TRIM(column_start)=""), 1, column_start), 1)))
(A1:F6, J1, J2, J3, J4)
Note the input arguments after the function =LAMBDA(…)(A1:F6, J1, J2, J3, J4)
, those are our inputs into the LAMBDA function. It should look like so:
The last step is to save this massive formula in the Name Manager, which can be found in Formulas ribbon. NB: copy on the formula =LAMBDA(…)
into the “Refers to” input and not the input arguments (A1:F6, J1, J2, J3, J4)
.
And now when we type =SLICE(
, we can see the saved formula:
Which we can then give the inputs that we would require:
We can “export” the named lambda function SLICE()
to another Excel workbook by copying a blank sheet from your current workbook where the LAMBDA named function has been saved to a new workbook.
Summary and the revised formulas
I realise that the formulas presented above are much too complicated and cumbersome. Therefore, I have revised the messy formulas to make them more efficient and easily readable.
The main idea of slicing data in Excel with the INDEX function is this:
=INDEX(source_data, SEQUENCE(num_rows, 1, row_start, 1),
SEQUENCE(1, num_cols, column_start, 1))
where the indexing parameters are as follows:
num_rows = row_end - row_start + 1,
num_cols = column_end - column_start + 1.
The super important conditions for the INDEX function to be able to slice a two-dimensional array are as follows:
- The row indices set has to be a vertical vector of numbers i.e.,
SEQUENCE(num_rows, 1, ...)
; - The column indices set has to be a horizontal vector of numbers i.e.,
SEQUENCE(1, num_cols, ...)
.
The resulting subset is at the intersection of the row and column indices sets.
The first step for automating the 2D array slicer is achieved with the LAMBDA function:
=LAMBDA(
source_data,
row_start,
row_end,
column_start,
column_end,
INDEX(source_data, SEQUENCE(row_end - row_start + 1, 1, row_start, 1),
SEQUENCE(1, column_end - column_start + 1, column_start, 1)))
Then we tidy up the LAMBDA using the LET function by introducing intermediate variables that will hold values and calculations:
=LAMBDA(source_data, row_start, row_end, column_start, column_end,
LET(
num_rows,
row_end - row_start + 1,
num_cols,
column_end - column_start + 1,
INDEX(source_data, SEQUENCE(num_rows, 1, row_start, 1),
SEQUENCE(1, num_cols, column_start, 1))))
Therefore, this is a much cleaner automation of the SLICE function, and it should easily read like the main idea above.
Further improvements
Validation is an important step to writing fail-safe functions. Checking for blank start and end indices, allowing for negative indices and validating that the start index is less than or equal to the end index for both rows and columns can be done as follows:
=LAMBDA(source_data,row_start,row_end,column_start,column_end,
LET(
valid_row_start,
IF(OR(ISBLANK(row_start), TRIM(row_start)=""), 1, row_start),
valid_row_end,
IF(OR(ISBLANK(row_end), TRIM(row_end)=""), ROWS(source_data),
IF(AND(row_end < 0, row_end >= -1*ROWS(source_data)),
ROWS(source_data) + row_end + 1,
IF(AND(row_end >= 1, row_end <= ROWS(source_data)), row_end, NA()))),
num_rows,
IF(valid_row_end >= valid_row_start,
valid_row_end - valid_row_start + 1, NA()),
valid_column_start,
IF(OR(ISBLANK(column_start), TRIM(column_start)=""), 1, column_start),
valid_column_end,
IF(OR(ISBLANK(column_end), TRIM(column_end)=""), COLUMNS(source_data),
IF(AND(column_end < 0, column_end >= -1*COLUMNS(source_data)),
COLUMNS(source_data) + column_end + 1,
IF(AND(column_end >= 1, column_end <= COLUMNS(source_data)),
column_end, NA()))),
num_cols,
IF(valid_column_end >= valid_column_start,
valid_column_end - valid_column_start + 1, NA()),
INDEX(source_data,
SEQUENCE(num_rows, 1, row_start, 1),
SEQUENCE(1, num_cols, column_start, 1))))(A1#, , 3, 4, -1)
Allowing for various step sizes and negative step sizes to reverse the indexing order can be done as follows:
=LAMBDA(source_data,row_start,row_end,row_step,column_start,column_end,col_step,
LET(
num_rows,
row_end - row_start + 1,
num_cols,
column_end - column_start + 1,
valid_row_step,
IF(OR(ISBLANK(row_step), TRIM(row_step)="", row_step=0), 1, row_step),
valid_col_step,
IF(OR(ISBLANK(col_step), TRIM(col_step)="", col_step=0), 1, col_step),
INDEX(source_data,
SEQUENCE(CEILING.MATH(ABS(num_rows/valid_row_step)), 1,
IF(valid_row_step < 0, row_end, row_start), valid_row_step),
SEQUENCE(1, CEILING.MATH(ABS(num_cols/valid_col_step)),
IF(valid_col_step < 0, column_end, column_start), valid_col_step)))
)(A1#, 1, 9, 3, 1, 9, -2)
Note that I have to recalculate the number of rows and columns taken for the custom step sizes.
Of course, these two customizations can be joined together. I will leave this for you to do as a fun exercise. 😎
Bonus — Return slice as a flat array
The INDEX dynamic array function returns a “flat” one-dimensional array by default. Therefore, in case the requirement is to return the slice of data “flattened”, the INDEX function can do that.
Instead of returning the 3-by-3 slice of the first 3 rows and 3 columns as follows =INDEX(A1:F6, TRANSPOSE({1, 2, 3}), {1, 2, 3})
, then try this:
=INDEX(A1:F6, {1,1,1,2,2,2,3,3,3}, {1,2,3,1,2,3,1,2,3})
to obtain the 1D array i.e., returning the slice row by row.
To automate this, one would have to get creative with the SEQUENCE function and a bit of math:
=INDEX(A1:F6, CEILING.MATH(SEQUENCE(, J3, J4) / J2),
MOD(SEQUENCE(, J3, J4-1), J2) + 1)
If you are interested in how to automate these number sequences in detail, then please check out the articles: repeating numbers in ascending order and repeating a sequence of numbers multiple times.
The following is a suggestion of a possible way of organizing the inputs for this formula.
We could also return the flattened slice as column by column. This can be achieved by simply swapping the rows and columns input as follows:
=INDEX(A1:F6, {1,2,3,1,2,3,1,2,3}, {1,1,1,2,2,2,3,3,3})
I shall leave this up to you to have fun with. Enjoy!
Downloads
You are welcome to download a copy of the example file. Thank you for reading my article and I hope you found it useful.
Before you open the file downloaded from GitHub for the first time, make sure to right-click on the file and select Properties. In General > Security, select the “Unblock” check-box and then click the “Apply” button.