How to slice data in Excel without VBA using the TAKE function
Slicing a two-dimensional data range is now possible in Excel thanks to Dynamic Array Functions introduced in September 2018. Throughout this series I will demonstrate how to use built-in functions such as TAKE, OFFSET, INDEX, CHOOSEROWS and CHOOSECOLS to slice data like in Python, MATLAB, etc. without writing a single line of VBA code.
This tutorial focuses on how to use the build-in Excel function TAKE to slice data in Excel. Then we will use the LAMBDA function to automate the slicing functionality to achieve a similar look and feel to Python, etc.
If you would rather prefer to skip all the explanations and jump straight to the formula, then please go directly to the summary! 🚀
Table of Contents
· What does it mean to “slice” data?
· Part I — Getting familiar with the TAKE function
∘ Example — Return the first 3 rows and first 3 columns
∘ Example — Return the last 3 rows and last 3 columns
∘ Example — Return the entire 2nd row
∘ Example — Return the entire 2nd column
∘ Example — Return the central rows 3 to 4 and columns 3 to 4
∘ Get a data point from the sliced data range
· Part II — Automating the slice function
∘ Basic functionality replication
∘ Advanced functionality replication
∘ Storing the functionality as a LAMBDA function
· Summary and the revised formulas
∘ Further improvements
· Downloads
What does it mean to “slice” data?
To “slice” data simply means to select a specific range or subset of data to be returned back to the user for a particular data analysis purpose.
The idea of slicing can be easily shown in GNU Octave Online at https://octave-online.net/ without having to install any software. Write the following code in the Octave Command Prompt to generate a 6-by-6 matrix containing integers from 1 to 36:
data = reshape(1:36, [6, 6]).'
To slice this matrix all you have to do is specify which row and column ranges you would like returned. For example, the following command would return rows 1 to 3 and columns 4 to 6:
data(1:3, 4:6)
If you are more comfortable with Python, then try the following code in your favourite IDE or try it online at https://jupyter.org/try in a new Jupyter Notebook:
import numpy as np
data = np.reshape(np.arange(1, 37), (6, 6))
print(data)
Note that indexing in Python starts at 0 and the last index is not included. To call rows 1 to 3, one would have to call index 0 to 3 where row index 3 is excluded. To return the same slice of data as above, run the command:
data[:3, 3:]
This slicing functionality is exactly what we will try to replicate in Excel without writing a UDF in VBA. We will use built-in functions only and a little bit of creativity.
Part I — Getting familiar with the TAKE function
Note: The TAKE function is only available for Excel for Microsoft 365 and Excel for the web subscribers.
Firstly, you may ask what are dynamic array functions? This functionality was introduced in September 2018 with Excel 2021 which automatically “spill” the formulas in the neighbouring blank cells. Thus making the legacy Ctrl+Shift+Enter
(CSE) array formulas redundant.
The syntax for this function is =TAKE(array, rows, [columns])
with the following input arguments:
- array is the range from which to take rows and columns.
- rows is the number of rows to take. Negative values takes from the end of the array.
- columns is the optional number of columns to take. Negative values takes from the end of the array.
The TAKE function thus spills the requested dynamic array starting from the cell that the formula is executed in.
Given the following data in an Excel spreadsheet:
Example — Return the first 3 rows and first 3 columns
Type the following formula in another cell =TAKE(A1:F6, 3, 3)
and this should return:
Example — Return the last 3 rows and last 3 columns
Making use of the negative indices, type the following formula in another cell =TAKE(A1:F6, -3, -3)
and you should get:
Example — Return the entire 2nd row
To extract only one row at a time, we need to nest the TAKE formula inside another TAKE formula. If we run the formula =TAKE(A1:F6, 2)
, it will take the first two rows. To obtain the second row, we need to run the nested formula=TAKE(TAKE(A1:F6, 2), -1)
to return the last row of the first two rows indicated by the -1
:
Example — Return the entire 2nd column
We make use of nesting following the logic above to return the second column. This time we leave the rows argument blank as follows =TAKE(TAKE(A1:F6, , 2), , -1)
and you should obtain:
Example — Return the central rows 3 to 4 and columns 3 to 4
Following the logic of nesting two TAKE functions, we can obtain say the central two rows and two columns of our data range. We take the first 4 rows and 4 columns, then we take the last 2 rows and 2 columns of the first take, to return the sliced dynamic array from rows 3 to 4 and columns 3 to 4. Type the following in a new cell =TAKE(TAKE(A1:F6, 4, 4), -2, -2)
and you should get:
Get a data point from the sliced data range
How to make use of the new sliced dynamic array? For example, let’s say that you ran the formula =TAKE(TAKE(A1:F6, 4, 4), -2, -2)
in cell A9. To obtain the data point in the 2nd row and 2nd column of the sliced data, we then make use of the INDEX function =INDEX(A9#, 2, 2)
in another cell. Note that the hash symbol specifies the cell where the dynamic array starts to spill from. Alternatively, simply enter =B10
into the new cell, and it should return the correct value from the dynamic array.
Part II — Automating the slice function
Automating the slice functionality can be achieved similarly to the functionality in Python, MATLAB, etc. with a bit of thought and creativity. That is, we would like to construct a named function such as =SLICE(source_data, row_start, row_end, column_start, column_end)
as not to repeat complicated nesting functions in Excel, and perhaps add some validation logic.
Warning: The logic will get more complicated from here on! This section is for the adventurous few who enjoy building complicated formulas.
Basic functionality replication
We need to build the basic concept of how to use the two nested TAKE functions in general i.e., =TAKE(TAKE(source_data, inner_rows, inner_columns), outer_rows, outer_columns)
. So the first job is to workout what the inner_rows, inner_columns, outer_rows and outer_columns formulas should be for the given inputs row_start, row_end, column_start and column_end. Obviously, the source_data will be range A1:F6.
Here we have the inner and outer arguments calculated very simply as:
inner_rows =J2
inner_columns =J4
outer_rows =J1 - J2 - 1
outer_columns =J3 - J4 - 1
The resulting dynamic array in P1# follows the basic concept above of =TAKE(TAKE(A1:F6, M1, M2), M3, M4)
.
Advanced functionality replication
Next, we can customize our logic further to cater for blank arguments and the end of the array arguments.
We start first with providing the user the option to keep arguments blank as it is done in Python, etc. If the start arguments are blank, we simply set them to 1 in our calculations, else if the end arguments are blank, then we set them to the rows count or the columns count of the source range.
Note that I have left cells J1 and J4 blank on purpose. Thus, the Excel formulas in column M are starting to get a bit more complicated:
inner_rows
=IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6), J2)
inner_columns
=IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6), J4)
outer_rows
=IF(OR(ISBLANK(J1), TRIM(J1)=""), 1, J1)
- IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6), J2) - 1
outer_columns
=IF(OR(ISBLANK(J3), TRIM(J3)=""), 1, J3)
- IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6), J4) - 1
Next, we add some extra validation for the input arguments and we also check if the end arguments are negative like in Python. For example, if column_end = -1
then it means the user is asking for the last column. If column_end = -2
then it means the user is asking for the second last column, and so on. Equivalently, in MATLAB we would specify the end of the array argument asend
. I find it easier to follow Python conventions using negative numbers to specify the location from the end of the array. See next the use of negative arguments in cells J2 and J4.
Therefore, the Excel formulas in column M for the inner and outer intermediate parameters have evolved to the following concoctions:
inner_rows
=IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2<0, J2>= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2>=1, J2<=ROWS(A1:F6)), J2, NA())))
inner_columns
=IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4<0, J4>= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6) + J4 + 1,
IF(AND(J4>=1, J4<=COLUMNS(A1:F6)), J4, NA())))
outer_rows
=IF(OR(ISBLANK(J1), TRIM(J1)=""), 1,
IF(AND(J1>=1, J1<=ROWS(A1:F6)), J1, NA()))
- IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2<0, J2>= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2>=1, J2<=ROWS(A1:F6)), J2, NA()))) - 1
outer_columns
=IF(OR(ISBLANK(J3), TRIM(J3)=""), 1,
IF(AND(J3>=1, J3<=COLUMNS(A1:F6)), J3, NA()))
- IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4<0, J4>= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6) + J4 + 1,
IF(AND(J4>=1, J4<=COLUMNS(A1:F6)), J4, NA()))) - 1
So far, all of the intermediate parameters calculated in column M have been simply input in cell P1 as =TAKE(TAKE(A1:F6, M1, M2), M3, M4)
. It is possible to consolidate all of these formulas into one gigantic formula, but it is not very pretty. Hence, the need to write a LAMBDA function which is much tidier.
Storing the functionality as a LAMBDA function
Note: The LAMBA function is only available for Excel for Microsoft 365 and Excel for the web subscribers.
We can package all the logic nicely in a LAMBDA function for further use in the current spreadsheet or to export it to another Excel file. As mentioned earlier, the general basic concept is to use the two nested TAKE functions i.e., =TAKE(TAKE(source_data, inner_rows, inner_columns), outer_rows, outer_columns)
. If we do consolidate the intermediate parameter calculations into one formula, then it would look something like this:
=TAKE(TAKE(A1:F6, IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2<0, J2>= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2>=1, J2<=ROWS(A1:F6)), J2, NA()))),
IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4<0, J4>= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6) + J4 + 1,
IF(AND(J4>=1, J4<=COLUMNS(A1:F6)), J4, NA())))),
IF(OR(ISBLANK(J1), TRIM(J1)=""), 1,
IF(AND(J1>=1, J1<=ROWS(A1:F6)), J1, NA()))
- IF(OR(ISBLANK(J2), TRIM(J2)=""), ROWS(A1:F6),
IF(AND(J2<0, J2>= -1*ROWS(A1:F6)), ROWS(A1:F6) + J2 + 1,
IF(AND(J2>=1, J2<=ROWS(A1:F6)), J2, NA()))) - 1,
IF(OR(ISBLANK(J3), TRIM(J3)=""), 1,
IF(AND(J3>=1, J3<=COLUMNS(A1:F6)), J3, NA()))
- IF(OR(ISBLANK(J4), TRIM(J4)=""), COLUMNS(A1:F6),
IF(AND(J4<0, J4>= -1*COLUMNS(A1:F6)), COLUMNS(A1:F6
) + J4 + 1,
IF(AND(J4>=1, J4<=COLUMNS(A1:F6)), J4, NA()))) - 1)
Warning: The monster formula above may cause serious headaches!
Nevertheless, the monster formula will be useful in building the LAMBDA function which is specified as follows =LAMBDA(parameters, calculation)(input_arguments)
. Therefore, our lambda function would be structured as as =LAMBDA(source_data, row_start, row_end, column_start, column_end, calculation_formula)(A1:F6, J1, J2, J3, J4)
, where the calculation_formula
is the monster formula above.
Using an editor such as Notepad++, we can replace the range references in the gigantic formula for the parameter names specified above. Thus, we get the following lambda formula for slicing in Excel:
=LAMBDA(source_data,row_start,row_end,column_start,column_end,
TAKE(TAKE(source_data, IF(OR(ISBLANK(row_end), TRIM(row_end)=""),
ROWS(source_data), IF(AND(row_end<0, row_end>= -1*ROWS(source_data)),
ROWS(source_data) + row_end + 1,
IF(AND(row_end>=1, row_end<=ROWS(source_data)), row_end, NA()))),
IF(OR(ISBLANK(column_end), TRIM(column_end)=""), COLUMNS(source_data),
IF(AND(column_end<0, column_end>= -1*COLUMNS(source_data)),
COLUMNS(source_data) + column_end + 1,
IF(AND(column_end>=1, column_end<=COLUMNS(source_data)), column_end,
NA())))), IF(OR(ISBLANK(row_start), TRIM(row_start)=""), 1,
IF(AND(row_start>=1, row_start<=ROWS(source_data)), row_start, NA()))
- IF(OR(ISBLANK(row_end), TRIM(row_end)=""), ROWS(source_data),
IF(AND(row_end<0, row_end>= -1*ROWS(source_data)),
ROWS(source_data) + row_end + 1,
IF(AND(row_end>=1, row_end<=ROWS(source_data)), row_end, NA()))) - 1,
IF(OR(ISBLANK(column_start), TRIM(column_start)=""), 1,
IF(AND(column_start>=1, column_start<=COLUMNS(source_data)), column_start,
NA())) - IF(OR(ISBLANK(column_end), TRIM(column_end)=""),
COLUMNS(source_data), IF(AND(column_end<0, column_end>= -1*
COLUMNS(source_data)), COLUMNS(source_data) + column_end + 1,
IF(AND(column_end>=1, column_end<=COLUMNS(source_data)), column_end,
NA()))) - 1))
This lambda formula was tested by inserting the whole formula in another cell =LAMBDA(…)(A1:F6, J1, J2, J3, J4)
, note the input arguments at the end.
The last step is to save the gigantic lambda formula in the Name Manger, which can be found in the Formulas ribbon. NB: Copy only the formula LAMDA(…)
into the “Refers to” section and not the input arguments (A1:F6, J1, J2, J3, J4)
.
The arguments will be input as follows in the new function SLICE(A1:F6, J1, J2, J3, J4)
. And voila!
With regards to “exporting” this named lambda function to another Excel workbook, simply copy a blank sheet from your current workbook where the LAMBDA named function has been saved to a new workbook.
Summary and the revised formulas
I realise that the formulas presented above are much too complicated and cumbersome. Therefore, I have revised the messy formulas to make them more efficient and easily readable.
The main idea of slicing data in Excel with the TAKE function is simply to nest two function as:
=TAKE(TAKE(source_data, inner_rows, inner_columns), outer_rows, outer_columns)
where the indexing parameters are as follows:
inner_rows = row_end,
inner_columns = column_end,
outer_rows = row_start - inner_end - 1,
outer_columns = column_start – inner_columns – 1.
The “inner take” obtains the first subset from the very first cell of the data range up to the specified end indices. The “outer take” trims this subset further to obtain the subset starting a the specified starting indices.
The first step for automating the 2D array slicer is achieved with the LAMBDA function:
=LAMBDA(
source_data,
row_start,
row_end,
column_start,
column_end,
TAKE(TAKE(source_data, row_end, column_end),
row_start - row_end - 1,
column_start - column_end - 1))
Then we tidy up the LAMBDA using the LET function by introducing intermediate variables that will hold values and calculations:
=LAMBDA(source_data, row_start, row_end, column_start, column_end,
LET(
inner_rows,
row_end,
inner_columns,
column_end,
outer_rows,
row_start - row_end - 1,
outer_columns,
column_start - column_end - 1,
TAKE(TAKE(source_data, inner_rows, inner_columns),
outer_rows, outer_columns)))
Therefore, this is a much cleaner automation of the SLICE function, and it should easily read like the main idea above.
Further improvements
Validation is an important step to writing fail-safe functions. Checking for blank start and end indices, allowing for negative indices and validating that the start index is less than or equal to the end index for both rows and columns can be done as follows:
=LAMBDA(source_data,row_start,row_end,column_start,column_end,
LET(
valid_start_row,
IF(OR(ISBLANK(row_start), TRIM(row_start)=""), 1,
IF(AND(row_start>=1, row_start<=ROWS(source_data)), row_start, NA())),
valid_end_row,
IF(OR(ISBLANK(row_end), TRIM(row_end)=""), ROWS(source_data),
IF(AND(row_end < 0, row_end >= -1*ROWS(source_data)), ROWS(source_data)
+ row_end + 1,
IF(AND(row_end >= 1, row_end <= ROWS(source_data)), row_end, NA()))),
inner_rows,
IF(valid_end_row >= valid_start_row, valid_end_row, NA()),
outer_rows,
valid_start_row - inner_rows - 1,
valid_start_column,
IF(OR(ISBLANK(column_start), TRIM(column_start)=""), 1,
IF(AND(column_start >= 1, column_start <= COLUMNS(source_data)),
column_start, NA())),
valid_end_column,
IF(OR(ISBLANK(column_end), TRIM(column_end)=""), COLUMNS(source_data),
IF(AND(column_end < 0, column_end>= -1*COLUMNS(source_data)),
COLUMNS(source_data) + column_end + 1,
IF(AND(column_end>=1, column_end<=COLUMNS(source_data)), column_end,
NA()))),
inner_columns,
IF(valid_end_column >= valid_start_column, valid_end_column, NA()),
outer_columns,
valid_start_column - inner_columns - 1,
TAKE(TAKE(source_data, inner_rows, inner_columns),
outer_rows, outer_columns))
)(A1#, 2, 3, 4, -1)
Allowing for various step sizes and negative step sizes to reverse the indexing order can be done as follows:
=LAMBDA(source_data,row_start,row_end,row_step,column_start,column_end,col_step,
LET(
inner_rows,
row_end,
inner_columns,
column_end,
outer_rows,
row_start - inner_rows - 1,
outer_columns,
column_start - inner_columns - 1,
valid_row_step,
IF(OR(ISBLANK(row_step), TRIM(row_step)="", row_step=0), 1, row_step),
valid_col_step,
IF(OR(ISBLANK(col_step), TRIM(col_step)="", col_step=0), 1, col_step),
INDEX(TAKE(TAKE(source_data, inner_rows, inner_columns),
outer_rows, outer_columns),
SEQUENCE(CEILING.MATH(ABS(outer_rows/valid_row_step)),, IF(
valid_row_step < 0, ABS(outer_rows), 1), row_step),
SEQUENCE(,CEILING.MATH(ABS(outer_columns/valid_col_step)), IF(
valid_col_step < 0, ABS(outer_columns), 1), col_step))
))(A1#, 2, 6, -2, 2, 6, -2)
Note that I am making use of the INDEX and SEQUENCE function to accommodate for the custom step sizes, and I have to recalculate the number of rows and columns taken for the custom step sizes.
Of course, these two customizations can be joined together. I will leave this for you to do as a fun exercise. 😎
Downloads
You are welcome to download a copy of the example file. Thank you for reading my article and I hope you found it useful.
Before you open the file downloaded from GitHub for the first time, make sure to right-click on the file and select Properties. In General > Security, select the “Unblock” check-box and then click the “Apply” button.