In this tutorial, we'll cover how to append to a dataframe in R, one of the most common modifications performed on a dataframe.
A dataframe is one of the essential data structures in the R programming language. It's known to be very flexible because it can contain many data types and is easy to modify. One of the most typical modifications performed on a dataframe in R is adding new observations (rows) to it.
In this tutorial, we'll discuss the different ways of appending one or more rows to a dataframe in R. If you're new to creating dataframes in R, you may want to read How to Create a Dataframe in R before continuing with this post.
Before starting, let's create a simple dataframe as an experiment:
super_sleepers_initial <- data.frame(animal=c('koala', 'hedgehog'),
    nation=c('Australia', 'Italy'),
    avg_sleep_hours=c(21, 18),
    stringsAsFactors=FALSE)
print(super_sleepers_initial)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
Note: when creating the above dataframe, we added the optional parameter stringsAsFactors=FALSE. In R versions before 4.0.0, this parameter defaulted to TRUE; since R 4.0.0, the default is FALSE. Unless the dataframe has no character columns, it's strongly recommended to keep it set to FALSE to suppress the conversion of character data to the factor data type and hence avoid undesired side effects. As an experiment, you can set this parameter to TRUE in the above piece of code, run this and the following code cells, and observe the results.
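To make the side effect concrete, here is a minimal sketch (the names with_factors and with_strings are hypothetical, chosen only for illustration) of what can happen when a character column is stored as a factor and we later assign a value that is not among its existing levels:

```r
# Two one-column dataframes that differ only in stringsAsFactors
with_factors <- data.frame(animal = c("koala", "hedgehog"), stringsAsFactors = TRUE)
with_strings <- data.frame(animal = c("koala", "hedgehog"), stringsAsFactors = FALSE)

# "sloth" is not an existing factor level, so R stores NA and emits a warning
# (suppressed here only to keep the output clean)
suppressWarnings(with_factors[nrow(with_factors) + 1, ] <- list("sloth"))

# The character column simply accepts the new value
with_strings[nrow(with_strings) + 1, ] <- list("sloth")

print(with_factors)
print(with_strings)
```

Comparing the two printed dataframes shows why character columns are usually easier to append to.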
Appending a Single Row to a Dataframe in R
Using rbind()
To append one row to a dataframe in R, we can use the rbind() built-in function, which stands for "row-bind". The basic syntax is the following:
dataframe <- rbind(dataframe, new_row)
Note that in the above syntax, new_row will most probably be a list rather than a vector, unless all the columns of our dataframe are of the same data type (which isn't frequently the case).
Let's reconstruct a new dataframe super_sleepers from the initial one super_sleepers_initial and append one more row to it:
# Reconstructing the super_sleepers dataframe
super_sleepers <- super_sleepers_initial
super_sleepers <- rbind(super_sleepers, list('sloth', 'Peru', 17))
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | sloth | Peru | 17 |
The new row was appended to the end of the dataframe.
It's important to keep in mind that here and in all the following examples, the new row (or rows) must mirror the structure of the dataframe to which it's appended. Here, this means that the length of the list has to be equal to the number of columns in the dataframe, and the data types of the items in the list have to follow the same order as the data types of the dataframe columns. Otherwise, R will either throw an error or coerce a column to a different data type.
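The following sketch illustrates why a list is usually the safer choice for new_row. It rebuilds the example dataframe under a hypothetical name (sleepers) so that it can run on its own, then appends the same observation once as a list and once as a vector created with c():

```r
sleepers <- data.frame(animal = c("koala", "hedgehog"),
                       nation = c("Australia", "Italy"),
                       avg_sleep_hours = c(21, 18),
                       stringsAsFactors = FALSE)

# A list keeps each item's own type, so avg_sleep_hours stays numeric
with_list <- rbind(sleepers, list("sloth", "Peru", 17))

# c() coerces all items to character, which turns the numeric column into character
with_vector <- rbind(sleepers, c("sloth", "Peru", 17))

str(with_list)
str(with_vector)
```

The output of str() makes the difference visible: only the list-based append preserves the numeric column.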
Using nrow()
Another way to append a single row to an R dataframe is by using the nrow() function. The syntax is as follows:
dataframe[nrow(dataframe) + 1,] <- new_row
This syntax literally means that we calculate the number of rows in the dataframe (nrow(dataframe)), add 1 to this number (nrow(dataframe) + 1), and then assign a new row new_row at that index of the dataframe (dataframe[nrow(dataframe) + 1,]), i.e., as a new last row.
Just as before, our new_row will most probably have to be a list rather than a vector, unless all the columns of the dataframe are of the same data type, which isn't common.
Let's add one more "super-sleeper" to our table:
super_sleepers[nrow(super_sleepers) + 1,] <- list('panda', 'China', 10)
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | sloth | Peru | 17 |
4 | panda | China | 10 |
Again, the new row was appended to the end of the dataframe.
Using add_row() of tidyverse
What if, instead, we want to add a new row not to the end of the dataframe but at some specific index of it? For example, we found out that tigers sleep 16 hours daily (i.e., more than pandas in our rating), so we need to insert this observation as the second-to-last row of the dataframe. Base R has no dedicated function for this, but we can use the add_row() function, which comes from the tibble package and is loaded as part of the tidyverse R package (we may need to install it, if it isn't installed yet, by running install.packages("tidyverse")):
library(tidyverse)
# Inserting the new observation before the current 4th row
super_sleepers <- super_sleepers %>% add_row(animal='tiger',
    nation='India',
    avg_sleep_hours=16,
    .before=4)
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | sloth | Peru | 17 |
4 | tiger | India | 16 |
5 | panda | China | 10 |
Here, we should note the following:
- We passed the same names of the columns in the same order as in the existing dataframe and assigned the corresponding new values to them.
- After that sequence, we added the .before optional parameter and specified the necessary index. If we didn't do that, the row would be added by default to the end of the dataframe. Alternatively, we could use the .after optional parameter and assign to it the index of the row after which to insert the new observation.
- We used the assignment operator <- to save the modifications applied to the current dataframe.
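For readers who prefer to avoid the extra dependency, inserting at a given position can also be approximated in base R by splitting the dataframe and binding the pieces back together. The helper below, insert_row(), is a hypothetical name introduced for this sketch and is not part of base R or the tidyverse; it assumes new_row is a one-row dataframe with the same column names:

```r
# Hypothetical helper: insert new_row so that it becomes row number `before`
insert_row <- function(df, new_row, before) {
  idx <- seq_len(nrow(df))
  top <- df[idx < before, , drop = FALSE]      # rows above the insertion point
  bottom <- df[idx >= before, , drop = FALSE]  # rows at and below it
  out <- rbind(top, new_row, bottom)
  rownames(out) <- NULL                        # renumber rows as 1..n
  out
}

sleepers <- data.frame(animal = c("koala", "hedgehog", "sloth", "panda"),
                       nation = c("Australia", "Italy", "Peru", "China"),
                       avg_sleep_hours = c(21, 18, 17, 10),
                       stringsAsFactors = FALSE)
tiger <- data.frame(animal = "tiger", nation = "India", avg_sleep_hours = 16,
                    stringsAsFactors = FALSE)

print(insert_row(sleepers, tiger, before = 4))
```

This is more verbose than add_row(), which is a fair argument for using the tidyverse function when it is available.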
Appending Multiple Rows to a Dataframe in R
Using rbind()
Often, we need to append not one but multiple rows to an R dataframe. The simplest method here is again to use the rbind() function. More precisely, in this case, we practically need to combine two dataframes: the initial one and the one containing the rows to be appended.
dataframe_1 <- rbind(dataframe_1, dataframe_2)
To see how it works, let's reconstruct a new dataframe super_sleepers from the initial one super_sleepers_initial, print it out to recall what it looks like, and append two new rows to its end:
# Reconstructing the super_sleepers dataframe
super_sleepers <- super_sleepers_initial
print(super_sleepers)
cat('\n\n') # printing an empty line
# Creating a new dataframe with the necessary rows
super_sleepers_2 <- data.frame(animal=c('squirrel', 'panda'),
    nation=c('Canada', 'China'),
    avg_sleep_hours=c(15, 10),
    stringsAsFactors=FALSE)
# Appending the rows of the new dataframe to the end of the existing one
super_sleepers <- rbind(super_sleepers, super_sleepers_2)
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |

| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | squirrel | Canada | 15 |
4 | panda | China | 10 |
As a reminder, the new rows must mirror the structure of the dataframe to which they're appended, meaning that the number of columns, the column names, and the data types should be the same in both dataframes. Keeping the columns in the same order is the safest choice.
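One nuance is worth verifying for yourself: when both arguments are dataframes, rbind() matches columns by name, so differently named columns trigger an error, while a different column order is tolerated. The sketch below uses hypothetical dataframes (base_df, reordered, renamed) to illustrate this:

```r
base_df <- data.frame(animal = c("koala", "hedgehog"),
                      avg_sleep_hours = c(21, 18),
                      stringsAsFactors = FALSE)

# Same column names in a different order: rbind() lines them up by name
reordered <- data.frame(avg_sleep_hours = 15, animal = "squirrel",
                        stringsAsFactors = FALSE)
print(rbind(base_df, reordered))

# A mismatched column name makes rbind() fail; tryCatch() captures the error
renamed <- data.frame(animal = "panda", hours = 10, stringsAsFactors = FALSE)
outcome <- tryCatch(rbind(base_df, renamed), error = function(e) "error")
print(outcome)
```

Relying on name matching can hide mistakes, so building the new dataframe with the same column order remains good practice.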
Using nrow()
Alternatively, we can use the nrow() function to append multiple rows to a dataframe in R. However, here this approach isn't recommended because the syntax becomes clumsy and difficult to read. Indeed, in this case, we need to calculate the start and the end index, which requires much more manipulation with the nrow() function. Technically, the syntax is as follows:
dataframe_1[(nrow(dataframe_1) + 1):(nrow(dataframe_1) + nrow(dataframe_2)),] <- dataframe_2
Let's reconstruct our super_sleepers from super_sleepers_initial and append to it the rows of the already existing dataframe super_sleepers_2:
# Reconstructing the super_sleepers dataframe
super_sleepers <- super_sleepers_initial
super_sleepers[(nrow(super_sleepers) + 1):(nrow(super_sleepers) + nrow(super_sleepers_2)),] <- super_sleepers_2
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | squirrel | Canada | 15 |
4 | panda | China | 10 |
We obtained the same dataframe as in the previous example, and we can see that the previous approach is much more elegant.
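If the index arithmetic is needed anyway, naming the intermediate values can make it easier to follow. The helper below, append_rows(), is a hypothetical function written for this sketch; it also guards against an empty set of new rows, because an expression like 3:2 counts downward in R rather than producing an empty range:

```r
# Hypothetical helper: append all rows of new_rows to the end of df by index
append_rows <- function(df, new_rows) {
  if (nrow(new_rows) == 0) return(df)           # nothing to append
  target <- nrow(df) + seq_len(nrow(new_rows))  # indices of the rows to create
  df[target, ] <- new_rows
  df
}

sleepers <- data.frame(animal = c("koala", "hedgehog"),
                       avg_sleep_hours = c(21, 18),
                       stringsAsFactors = FALSE)
more_sleepers <- data.frame(animal = c("squirrel", "panda"),
                            avg_sleep_hours = c(15, 10),
                            stringsAsFactors = FALSE)

print(append_rows(sleepers, more_sleepers))
```

Even with the helper, rbind() stays the shorter option for this task.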
Using add_row() of tidyverse
Finally, we can again use the add_row() function of the tidyverse package. This approach is more flexible since we can either append the new rows at the end of the current dataframe or insert them before/after a certain row specified by its index.
Let's insert the observations for sloth and tiger between hedgehog and squirrel in the current dataframe super_sleepers:
library(tidyverse)
# Inserting two observations before the current 3rd row
super_sleepers <- super_sleepers %>% add_row(animal=c('sloth', 'tiger'),
    nation=c('Peru', 'India'),
    avg_sleep_hours=c(17, 16),
    .before=3)
print(super_sleepers)
| animal | nation | avg_sleep_hours |
1 | koala | Australia | 21 |
2 | hedgehog | Italy | 18 |
3 | sloth | Peru | 17 |
4 | tiger | India | 16 |
5 | squirrel | Canada | 15 |
6 | panda | China | 10 |
Note that to obtain the above result, we could use .after=2 instead of .before=3.
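As a quick check of that equivalence, the sketch below applies both variants to a small standalone dataframe (sleepers is a hypothetical name) and compares the results. It assumes the tibble package, which provides add_row(), is installed:

```r
library(tibble)  # add_row() is exported by tibble (also loaded with tidyverse)

sleepers <- data.frame(animal = c("koala", "hedgehog", "squirrel"),
                       avg_sleep_hours = c(21, 18, 15),
                       stringsAsFactors = FALSE)

# Insert the same observation before row 3 and, separately, after row 2
via_before <- add_row(sleepers, animal = "sloth", avg_sleep_hours = 17, .before = 3)
via_after  <- add_row(sleepers, animal = "sloth", avg_sleep_hours = 17, .after = 2)

print(via_before)
print(identical(via_before, via_after))
```

Which parameter to use is mostly a matter of which neighboring row is easier to identify.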
Conclusion
In this tutorial, we learned how to append a single row (or multiple rows) to a dataframe in R, or how to insert it (or them) at a specific index of the dataframe. In particular, we considered 3 approaches: using the rbind() or nrow() functions of base R, or the add_row() function of the tidyverse R package. We paid special attention to the best use cases for each method and the nuances that have to be taken into account in various situations.
If you'd like to learn more about working with dataframes in R, check out How to Add a Column to a Dataframe in R (with 18 Code Examples)