Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset is organised as follows:
- There are 3 folders (room_1/2/3), which contain data collected in 3 different rooms.
- Each room folder contains a varying number of subfolders, one per data-capturing session.
- Each session folder contains data.csv, which stores the CSI data for each packet, as well as label.csv and label_boxes.csv, which contain the activity labels and the person bounding boxes, respectively.
Dataset characteristics:
- Activities: walking, sitting, standing, lying, getting up, getting down, no activity
- Number of people involved: 1
- Number of rooms used: 3
- WiFi router: TP-Link TL-WDR4300
- Channel: 60
- Bandwidth: 40 MHz
- Frequency: 5 GHz
- Antennas: 2 Rx x 2 Tx
- Number of subcarriers: 114
See http://github.com/retsediv/WIFI_CSI_based_HAR for the source code of the data collection process, processing, analysis, and model development.
Disclaimer: All data were collected by myself, and I was the only person performing the activities. My purpose was to make the data more easily available to the community for further research in the time of COVID-19.
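The folder layout described above can be walked with a few lines of Python. This is a minimal sketch, not the repository's own loader: the function name `list_sessions` is hypothetical, and it assumes the room folders are named `room_1`, `room_2`, `room_3` directly under the dataset root. Only the three CSV file names are taken from the description; session folder names are not assumed.

```python
from pathlib import Path

# File names taken from the dataset description.
EXPECTED_FILES = ("data.csv", "label.csv", "label_boxes.csv")


def list_sessions(root):
    """Return (room, session, {file name: path}) for every session folder.

    Assumes room folders match 'room_*' directly under `root`
    (an illustrative assumption about the layout).
    """
    sessions = []
    for room in sorted(Path(root).glob("room_*")):
        if not room.is_dir():
            continue
        for session in sorted(p for p in room.iterdir() if p.is_dir()):
            # Keep only the expected files that actually exist.
            files = {
                name: session / name
                for name in EXPECTED_FILES
                if (session / name).is_file()
            }
            sessions.append((room.name, session.name, files))
    return sessions
```

Each returned entry can then be passed to whatever CSV reader suits the column layout of the files, which is not specified here.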
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the Wallhack1.8k dataset for WiFi-based long-range activity recognition in Line-of-Sight (LoS) and Non-Line-of-Sight (NLoS)/Through-Wall scenarios, as proposed in [1,2], as well as the CAD models (of 3D-printable parts) of the WiFi systems proposed in [2].
PyTorch Dataloader
A minimal PyTorch dataloader for the Wallhack1.8k dataset is provided at: https://github.com/StrohmayerJ/wallhack1.8k
Dataset Description
The Wallhack1.8k dataset comprises 1,806 CSI amplitude spectrograms (and raw WiFi packet time series) corresponding to three activity classes: "no presence," "walking," and "walking + arm-waving." WiFi packets were transmitted at a frequency of 100 Hz, and each spectrogram captures a temporal context of approximately 4 seconds (400 WiFi packets).
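The relation between packet rate and temporal context (400 packets at 100 Hz, i.e. about 4 seconds) can be illustrated with a small windowing sketch. The function name `split_into_windows` and the default non-overlapping stride are assumptions made for illustration; the description does not state how the published spectrograms were cut from the packet sequences.

```python
import numpy as np

PACKET_RATE_HZ = 100   # packet rate from the dataset description
WINDOW_PACKETS = 400   # packets per spectrogram from the dataset description

# Temporal context of one window in seconds.
window_seconds = WINDOW_PACKETS / PACKET_RATE_HZ


def split_into_windows(amplitudes, window=WINDOW_PACKETS, stride=WINDOW_PACKETS):
    """Cut a (packets, subcarriers) array into (windows, window, subcarriers).

    The stride is an assumption; stride == window gives non-overlapping windows.
    Trailing packets that do not fill a whole window are dropped.
    """
    amplitudes = np.asarray(amplitudes)
    if amplitudes.ndim != 2:
        raise ValueError("expected a 2-D array of shape (packets, subcarriers)")
    n_packets, n_subcarriers = amplitudes.shape
    if n_packets < window:
        return np.empty((0, window, n_subcarriers), dtype=amplitudes.dtype)
    starts = range(0, n_packets - window + 1, stride)
    return np.stack([amplitudes[s:s + window] for s in starts])
```

For example, a sequence of 1,000 packets with 52 subcarriers yields two non-overlapping windows of shape (400, 52), and the remaining 200 packets are discarded.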
To assess cross-scenario and cross-system generalization, WiFi packet sequences were collected in LoS and through-wall (NLoS) scenarios, utilizing two different WiFi systems (BQ: biquad antenna and PIFA: printed inverted-F antenna). The dataset is structured accordingly:
LOS/BQ/ <- WiFi packets collected in the LoS scenario using the BQ system
LOS/PIFA/ <- WiFi packets collected in the LoS scenario using the PIFA system
NLOS/BQ/ <- WiFi packets collected in the NLoS scenario using the BQ system
NLOS/PIFA/ <- WiFi packets collected in the NLoS scenario using the PIFA system
These directories contain the raw WiFi packet time series (see Table 1). Each row represents a single WiFi packet with the complex CSI vector H being stored in the "data" field and the class label being stored in the "class" field. H is of the form [I, R, I, R, ..., I, R], where two consecutive entries represent imaginary and real parts of complex numbers (the Channel Frequency Responses of subcarriers). Taking the absolute value of H (e.g., via numpy.abs(H)) yields the subcarrier amplitudes A.
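Because H stores imaginary and real parts as separate consecutive entries, the pairs have to be combined into complex numbers before the absolute value gives per-subcarrier amplitudes. The following is a minimal sketch of that step; the function names are hypothetical, and it assumes the "data" field has already been parsed into a flat sequence of numbers.

```python
import numpy as np


def interleaved_to_complex(h):
    """Convert [I, R, I, R, ...] into a complex array R + jI."""
    h = np.asarray(h, dtype=np.float64)
    if h.ndim != 1 or h.size % 2 != 0:
        raise ValueError("expected a flat sequence with an even number of entries")
    imag = h[0::2]  # even positions hold the imaginary parts
    real = h[1::2]  # odd positions hold the real parts
    return real + 1j * imag


def subcarrier_amplitudes(h):
    """Amplitude A of each subcarrier, i.e. |R + jI|."""
    return np.abs(interleaved_to_complex(h))
```

As a quick check, the input [3, 4, 0, -2] encodes the complex values 4+3j and -2+0j, so the amplitudes are 5 and 2.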
To extract the 52 L-LTF subcarriers used in [1], the following indices of A are to be selected:
csi_valid_subcarrier_index = []
csi_valid_subcarrier_index += [i for i in range(6, 32)]
csi_valid_subcarrier_index += [i for i in range(33, 59)]
Additional 56 HT-LTF subcarriers can be selected via:
csi_valid_subcarrier_index += [i for i in range(66, 94)]
csi_valid_subcarrier_index += [i for i in range(95, 123)]
For more details on subcarrier selection, see ESP-IDF (Section Wi-Fi Channel State Information) and esp-csi.
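The index ranges above can be applied to an amplitude array as follows. This is a sketch under the assumption that the subcarrier axis is the last axis of A and has at least 123 entries; the function name `select_subcarriers` and the `include_ht_ltf` flag are illustrative and not part of the dataset's tooling.

```python
import numpy as np

# Index ranges as given in the dataset description.
L_LTF_INDEX = list(range(6, 32)) + list(range(33, 59))      # 52 subcarriers
HT_LTF_INDEX = list(range(66, 94)) + list(range(95, 123))   # 56 subcarriers


def select_subcarriers(A, include_ht_ltf=False):
    """Select the valid subcarriers along the last axis of A."""
    A = np.asarray(A)
    index = L_LTF_INDEX + (HT_LTF_INDEX if include_ht_ltf else [])
    if A.shape[-1] <= max(index):
        raise ValueError("A has too few subcarrier entries for the requested indices")
    return A[..., index]
```

With the L-LTF indices alone the result has 52 columns; adding the HT-LTF indices gives 108.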
Extracted amplitude spectrograms with the corresponding label files of the train/validation/test split: "trainLabels.csv," "validationLabels.csv," and "testLabels.csv," can be found in the spectrograms/ directory.
The columns in the label files correspond to the following: [Spectrogram index, Class label, Room label]
Spectrogram index: [0, ..., n]
Class label: [0,1,2], where 0 = "no presence", 1 = "walking", and 2 = "walking + arm-waving."
Room label: [0,1,2,3,4,5], where labels 1-5 correspond to the room number in the NLoS scenario (see Fig. 3 in [1]). The label 0 corresponds to no room and is used for the "no presence" class.
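A label file with these three columns could be read along the following lines. This is a hedged sketch: whether the CSV files contain a header row is not stated, so rows that do not parse as three integers are simply skipped. The names `SpectrogramLabel`, `CLASS_NAMES`, and `read_labels` are hypothetical.

```python
import csv
from typing import NamedTuple

# Class names as listed in the dataset description.
CLASS_NAMES = {0: "no presence", 1: "walking", 2: "walking + arm-waving"}


class SpectrogramLabel(NamedTuple):
    index: int
    class_label: int
    room_label: int


def read_labels(lines):
    """Parse rows of [spectrogram index, class label, room label].

    `lines` is any iterable of CSV text lines, for example an open file.
    Rows with fewer than three fields or with non-integer fields
    (such as a possible header row) are skipped.
    """
    labels = []
    for row in csv.reader(lines):
        if len(row) < 3:
            continue
        try:
            labels.append(SpectrogramLabel(*(int(v) for v in row[:3])))
        except ValueError:
            continue
    return labels
```

Typical use would be `with open("spectrograms/trainLabels.csv", newline="") as f: labels = read_labels(f)`, assuming the file sits at that path in the extracted archive.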
Dataset Overview:
Table 1: Raw WiFi packet sequences.
| Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total |
| LoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | |
| LoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | |
| NLoS | BQ | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | |
| NLoS | PIFA | b1.csv | w1.csv, w2.csv, w3.csv, w4.csv and w5.csv | ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv | |
| Total | | 4 | 20 | 20 | 44 |
Table 2: Sample/Spectrogram distribution across activity classes in Wallhack1.8k.
| Scenario | System | "no presence" / label 0 | "walking" / label 1 | "walking + arm-waving" / label 2 | Total |
| LoS | BQ | 149 | 154 | 155 | |
| LoS | PIFA | 149 | 160 | 152 | |
| NLoS | BQ | 148 | 150 | 152 | |
| NLoS | PIFA | 143 | 147 | 147 | |
| Total | | 589 | 611 | 606 | 1,806 |
Download and Use
This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to one of our papers [1,2].
[1] Strohmayer, Julian, and Martin Kampel. (2024). “Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition”, In IFIP International Conference on Artificial Intelligence Applications and Innovations (pp. 42-56). Cham: Springer Nature Switzerland, doi: https://doi.org/10.1007/978-3-031-63211-2_4.
[2] Strohmayer, Julian, and Martin Kampel. (2024). “Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition”, In 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates (pp. 3594-3599), doi: https://doi.org/10.1109/ICIP51287.2024.10647666.
BibTeX citations:
@inproceedings{strohmayer2024data,
title={Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition},
author={Strohmayer, Julian and Kampel, Martin},
booktitle={IFIP International Conference on Artificial Intelligence Applications and Innovations},
pages={42--56},
year={2024},
organization={Springer}}
@INPROCEEDINGS{10647666,
author={Strohmayer, Julian and Kampel, Martin},
booktitle={2024 IEEE International Conference on Image Processing (ICIP)},
title={Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition},
year={2024},
volume={},
number={},
pages={3594-3599},
keywords={Visualization;Accuracy;System performance;Directional antennas;Directive antennas;Reflector antennas;Sensors;Human Activity Recognition;WiFi;Channel State Information;Through-Wall Sensing;ESP32},
doi={10.1109/ICIP51287.2024.10647666}}
Description:
The Behavior-based WiFi User Dataset for user authentication. This dataset contains the physiological characteristics captured by WiFi from 10 participants for 10 different activities. Each participant performs 20 rounds for each activity. The experiments are conducted in two different environments: a campus office and a home apartment. The dataset can be used by fellow researchers to reproduce the original work or to explore other machine-learning problems in the domain of WiFi sensing.
Format: .dat format
Section 1: Device Configuration
Two commercial Dell E6430 laptops serve as transmitter and receiver. Both run a Linux 14.04 operating system with the 4.2.0 kernel and are equipped with 3 Mini PCI-E internal antennas.
An Intel 5300 network interface card (NIC) is used for CSI collection. Detailed information on the CSI tool can be found at https://dhalperi.github.io/linux-80211n-csitool/faq.html.
The WiFi packet transmission rate is set to 1000 pkts/s.
Section 2: Data Format
We provide raw data received by the CSI tool. The data files are saved in the dat format. The details are shown in the following:
10 participants are included in two different experiments.
Each participant performed 20 rounds for each activity.
Each data file is named "User_Day_Action_Location". The fields are:
User: The participants that CSI was collected from.
Day: The date this data was collected.
Action: The specific activity performed.
Location: The specific location the experiment was conducted.
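The naming scheme can be split back into its four fields with a short helper. This is only a sketch: the helper name `parse_recording_name` and the example file name are hypothetical, the exact encoding of each field is not documented here, and the approach would fail if a field itself contained an underscore.

```python
from pathlib import Path
from typing import NamedTuple


class RecordingName(NamedTuple):
    user: str
    day: str
    action: str
    location: str


def parse_recording_name(path):
    """Split a 'User_Day_Action_Location' file name into its fields."""
    stem = Path(path).stem  # drops the directory and the .dat extension
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 underscore-separated fields, got {len(parts)}: {stem!r}"
        )
    return RecordingName(*parts)


# Hypothetical file name, for illustration only.
example = parse_recording_name("apartment/user1_0301_A_1.dat")
```

Grouping files by `user` or `action` then becomes a matter of sorting on the corresponding attribute.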
Section 3: Experimental Setups
There are two experiment setups for our data collection. An image of the experimental setup and the illustration of activities from two different environments is included in the dataset. Each activity was performed in a designated location. In each activity location, the specific activity was conducted in 4 different proximate locations at least one foot away from each other.
Residential Apartment
Environment: The experiments are conducted in a residential apartment measuring 33ft × 17ft.
Participants: 10 users, all students from Rutgers University (aged from 20 to 30).
Activities: 7 activities were performed.
Detailed Activities Performed in Apartment
| Code | Activity |
| A→B | Walking (trajectory 1) |
| B→C | Walking (trajectory 2) |
| B | Picking up a remote control |
| C | Sitting in a chair |
| D | Exercising |
| E | Operating on the oven |
| F | Using the stove |
Office
Environment: The experiments are conducted in an office measuring 21ft × 12ft.
Participants: 5 users, all students from Rutgers University (aged from 20 to 30).
Activities: 3 activities were performed.
Detailed Activities Performed in Office
| Code | Activity |
| G | Sitting in a seat |
| H | Stretching the body |
| I | Typing on a keyboard |
Section 4: Data Description
We separate our raw data into different folders based on environment type. Within each environment type, the data are further organised by date. Each file includes all data from the three internal antennas. All data files are in .dat format. We also provide Matlab scripts for CSI analysis and visualization. The scripts expose the following variables:
CSI: The Channel State Information (CSI) received at one receiver antenna. It describes the signal propagation from the transmitter to the receiver and is very sensitive to environmental changes. Each CSI measurement covers 30 subcarriers.
Relative Phase: A measure of the degree of synchronization between the data received at different antennas. It can be used to determine the phase offset for further signal preprocessing.
Time: The time span covered by the data file, measured in seconds. It can be used to determine how long the signal was recorded.
Section 5: Codes
analysis_spectrogram.m: loads a .dat file and extracts the variables listed in the data description (i.e., CSI and Relative Phase).
Section 6: Citations
If your paper is related to our works, please cite our papers as follows.
https://ieeexplore.ieee.org/document/9356038
C. Shi, J. Liu, N. Borodinov, B. Leao and Y. Chen, "Towards Environment-independent Behavior-based User Authentication Using WiFi," 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Delhi, India, 2020, pp. 666-674, doi: 10.1109/MASS50613.2020.00086
Bibtex:
@INPROCEEDINGS{9356038, author={Shi, Cong and Liu, Jian and Borodinov, Nick and Leao, Bruno and Chen, Yingying}, booktitle={2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS)}, title={Towards Environment-independent Behavior-based User Authentication Using WiFi}, year={2020}, volume={}, number={}, pages={666-674}, doi={10.1109/MASS50613.2020.00086}}
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description:
This environment-independent user authentication dataset is from our MASS 2020 paper: Towards Environment-independent Behavior-based User Authentication Using WiFi. This dataset contains the physiological characteristics captured by WiFi from 10 participants for 10 different activities. Each participant performs 20 rounds for each activity. The experiments are conducted in two different environments, the campus office, and the home apartment. The system performance is tested on the cross-environment scenarios (training in one environment and testing in another environment).
Note: The MASS 2020 paper is based on our MobiHoc 2017 paper, Smart User Authentication through Actuation of Daily Activities Leveraging WiFi-enabled IoT. The MobiHoc 2017 work focused on user authentication using CSI extracted from human activity while the MASS 2020 work focused on the domain adaptation of user authentication using activity CSI.
The dataset of our MobiHoc 2017 work is also published: https://zenodo.org/record/7750976#.ZBfTZ3bMKUk
Format: .dat format
Section 1: Device Configuration
Section 2: Data Format
We provide raw data received by the CSI tool. The data files are saved in the dat format. The details are shown in the following:
Section 3: Experimental Setups
There are two experiment setups for our data collection. An image of the experimental setup and the illustration of activities from two different environments is included in the dataset. Each activity was performed in a designated location. In each activity location, the specific activity was conducted in 4 different proximate locations at least one foot away from each other.
| Code | Activity |
| A→B | Walking (trajectory 1) |
| B→C | Walking (trajectory 2) |
| B | Picking up a remote control |
| C | Sitting in a chair |
| D | Exercising |
| E | Operating on the oven |
| F | Using the stove |
| Code | Activity |
| G | Sitting in a seat |
| H | Stretching the body |
| I | Typing on a keyboard |
Section 4: Data Description
We separate our raw data into different folders based on different environment types. In each environment type, data are further distributed in terms of date. Each file includes all data from three internal antennas. All data files are in .dat format. We also provide Matlab scripts for CSI analysis and visualization. The following variables can be revealed from the codes:
Section 5: Codes
Section 6: Citations
If your paper is related to our works, please cite our papers as follows.
https://ieeexplore.ieee.org/document/9356038
C. Shi, J. Liu, N. Borodinov, B. Leao and Y. Chen, "Towards Environment-independent Behavior-based User Authentication Using WiFi," 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Delhi, India, 2020, pp. 666-674, doi: 10.1109/MASS50613.2020.00086
Bibtex:
@INPROCEEDINGS{9356038,
author={Shi, Cong and Liu, Jian and Borodinov, Nick and Leao, Bruno and Chen, Yingying},
booktitle={2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS)},
title={Towards Environment-independent Behavior-based User Authentication Using WiFi},
year={2020},
volume={},
number={},
pages={666-674},
doi={10.1109/MASS50613.2020.00086}}
The current version of the dataset has been reduced because of its size. If you wish to obtain the full version or have any questions regarding the dataset, contact us by email: cl1361@scarletmail.rutgers.edu.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description:
The behavior-based user authentication dataset is collected from the smart user authentication system through daily activities leveraging commodity WiFi. The dataset contains the extracted CSI features from 8 walking activities and 9 stationary activities from 11 and 5 volunteers, respectively. The experiments are conducted in 2 different environments, including a university office and an apartment. We hope this dataset will help researchers to reproduce the former work of user authentication through WiFi sensing.
Dataset Format:
.dat files
Section 1: Device Configuration:
Section 2: Data Format
We provide raw data received by the CSI tool. The data files are saved in the dat format. The details are shown in the following:
Note: We selected a subset of the data to keep the dataset compact; not all data collected while writing the paper are published. If you have any questions regarding the dataset, please contact us for details.
Section 3: Experimental Setups
There are 2 different experiment setups, including a university office and an apartment environment, for our data collection. The detailed setups are shown in the paper. For the activities, we involve 8 walking activities and 8 stationary activities. An image of the experimental setup and the illustration of activities from two different environments is included in the dataset.
| Code | Walking activity | Code | Stationary activity |
| A | Entrance ⇒ Seat | a | Working (i.e., typing keyboard) |
| B | Seat ⇒ Entrance | b | Turning on the light |
| C | Seat ⇒ Light Switch | c | Opening the cabinet |
| D | Light Switch ⇒ Seat | d | Fetching documents |
| E | Seat ⇒ Cabinet | e | Eating at the table |
| F | Cabinet ⇒ Seat | f | Opening the microwave oven |
| G | Entrance ⇒ Kitchen | g | Opening the refrigerator |
| H | Kitchen ⇒ Entrance | h | Opening the door |
Section 4: Data Description
We separate our raw data into different folders based on different environment types. In each environment type, data are further distributed in terms of date. Each file includes all data from three internal antennas. All data files are in .dat format. We also provide Matlab scripts for CSI analysis and visualization. The following variables can be revealed from the codes:
Section 5: Codes
Section 6: Citations
If your work is related to our work, please cite our papers as follows.
https://dl.acm.org/doi/10.1145/3084041.3084061
Cong Shi, Jian Liu, Hongbo Liu, and Yingying Chen. 2017. Smart User Authentication through Actuation of Daily Activities Leveraging WiFi-enabled IoT. In Proceedings of the 18th ACM International Symposium on Mobile Ad Hoc Networking and Computing (Mobihoc '17). Association for Computing Machinery, New York, NY, USA, Article 5, 1–10.
Bibtex:
@inproceedings{shi2017smart,
title={Smart user authentication through actuation of daily activities leveraging WiFi-enabled IoT},
author={Shi, Cong and Liu, Jian and Liu, Hongbo and Chen, Yingying},
booktitle={Proceedings of the 18th ACM International Symposium on Mobile Ad Hoc Networking and Computing},
pages={1--10},
year={2017}
}
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
On the Generalization of WiFi-based Person-centric Sensing in Through-Wall Scenarios
This repository contains the 3DO dataset proposed in [1].
PyTorch Dataloader
A minimal PyTorch dataloader for the 3DO dataset is provided at: https://github.com/StrohmayerJ/3DO
Dataset Description
The 3DO dataset comprises 42 five-minute recordings (~1.25M WiFi packets) of three human activities performed by a single person, captured in a WiFi through-wall sensing scenario over three consecutive days. Each WiFi packet is annotated with a 3D trajectory label and a class label for the activities: no person/background (0), walking (1), sitting (2), and lying (3). (Note: The labels returned in our dataloader example are walking (0), sitting (1), and lying (2), because background sequences are not used.)
The directories 3DO/d1/, 3DO/d2/, and 3DO/d3/ contain the sequences from days 1, 2, and 3, respectively. Furthermore, each sequence directory (e.g., 3DO/d1/w1/) contains a csiposreg.csv file storing the raw WiFi packet time series and a csiposreg_complex.npy cache file, which stores the complex Channel State Information (CSI) of the WiFi packet time series. (If missing, csiposreg_complex.npy is automatically generated by the provided dataloader.)
Dataset Structure:
/3DO
├── d1 <-- day 1 subdirectory
│   └── w1 <-- sequence subdirectory
│       ├── csiposreg.csv <-- raw WiFi packet time series
│       └── csiposreg_complex.npy <-- CSI time series cache
├── d2 <-- day 2 subdirectory
└── d3 <-- day 3 subdirectory
In [1], we use the following training, validation, and test split:
| Subset | Day | Sequences |
| Train | 1 | w1, w2, w3, s1, s2, s3, l1, l2, l3 |
| Val | 1 | w4, s4, l4 |
| Test | 1 | w5, s5, l5 |
| Test | 2 | w1, w2, w3, w4, w5, s1, s2, s3, s4, s5, l1, l2, l3, l4, l5 |
| Test | 3 | w1, w2, w4, w5, s1, s2, s3, s4, s5, l1, l2, l4 |
w = walking, s = sitting, and l = lying
Note: On each day, we additionally recorded three ten-minute background sequences (b1, b2, b3), which are provided as well.
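For scripting, the split table can be transcribed into a small lookup structure. The sketch below is not taken from the provided dataloader; the names `SPLIT` and `sequence_dirs` are hypothetical, and the paths follow the 3DO/d<day>/<sequence>/ layout described earlier.

```python
from pathlib import Path

# Subset -> day -> sequence names, transcribed from the split table.
SPLIT = {
    "train": {1: ["w1", "w2", "w3", "s1", "s2", "s3", "l1", "l2", "l3"]},
    "val": {1: ["w4", "s4", "l4"]},
    "test": {
        1: ["w5", "s5", "l5"],
        2: [f"{activity}{i}" for activity in "wsl" for i in range(1, 6)],
        3: ["w1", "w2", "w4", "w5", "s1", "s2", "s3", "s4", "s5", "l1", "l2", "l4"],
    },
}


def sequence_dirs(root, subset):
    """Return the sequence directories of one subset, e.g. <root>/d1/w1."""
    return [
        Path(root) / f"d{day}" / sequence
        for day, sequences in SPLIT[subset].items()
        for sequence in sequences
    ]
```

The three subsets contain 9, 3, and 30 sequences, respectively, which adds up to the 42 recordings mentioned in the dataset description.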
Download and Use
This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to our paper [1].
[1] Strohmayer, J., Kampel, M. (2025). On the Generalization of WiFi-Based Person-Centric Sensing in Through-Wall Scenarios. In: Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15315. Springer, Cham. https://doi.org/10.1007/978-3-031-78354-8_13
BibTeX citation:
@inproceedings{strohmayerOn2025,
author="Strohmayer, Julian and Kampel, Martin",
title="On the Generalization of WiFi-Based Person-Centric Sensing in Through-Wall Scenarios",
booktitle="Pattern Recognition",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="194--211",
isbn="978-3-031-78354-8"
}