Add short description and remove redundancy in acknowledgements

#2
by egrace479 - opened
Files changed (1)
  1. README.md +21 -97
README.md CHANGED
@@ -17,6 +17,7 @@ tags:
17
  - UAV
18
  - drone
19
  - video
 
20
  ---
21
 
22
  # Model Card for X3D-KABR-Kinetics
@@ -33,15 +34,15 @@ behavioral ecologist.
33
 
34
  ### Model Description
35
 
36
- - **Developed by:** [Maksim Kholiavchenko, Maksim Kukushkin, Otto Brookes, Jenna Kline, Sam Stevens, Isla Duporge, Alec Sheets,
37
  Reshma R. Babu, Namrata Banerji, Elizabeth Campolongo,
38
  Matthew Thompson, Nina Van Tiel, Jackson Miliko,
39
  Eduardo Bessa Mirmehdi, Thomas Schmid,
40
- Tanya Berger-Wolf, Daniel I. Rubenstein, Tilo Burghardt, Charles V. Stewart]
41
 
42
- - **Model type:** [X3D]
43
- - **License:** [MIT]
44
- - **Fine-tuned from model:** [X3D-S, Kinetics]
45
 
46
  This model was developed for the benefit of the community as an open-source product, thus we request that any derivative products are also open-source.
47
 
@@ -88,7 +89,7 @@ for more information on how this model can be used generate time-budgets from ae
88
 
89
  ### Training Data
90
 
91
- [KABR Dataset](https://huggingface.co/datasets/imageomics/KABR)
92
 
93
  ### Training Procedure
94
 
@@ -102,7 +103,7 @@ For each tracklet, we create a separate video, called a mini-scene, by extractin
102
  detection in a video frame.
103
  This allows us to compensate for the drone's movement and provides a stable, zoomed-in representation of the animal.
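The extraction step described above can be sketched in plain Python. This is a minimal illustration of the centered-crop idea, not the project's actual preprocessing code; the function name, crop size, and box format are all hypothetical.

```python
def mini_scene_crop(box, frame_w, frame_h, crop_w=400, crop_h=300):
    """Return a fixed-size crop window centered on a detection box.

    `box` is (x1, y1, x2, y2) in pixel coordinates. The window is
    clamped so it never leaves the frame, which keeps the animal
    centered (or near-centered at frame edges) across the tracklet.
    Crop size and box format are illustrative assumptions.
    """
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    # Top-left corner of the crop, clamped to the frame bounds.
    x0 = int(min(max(cx - crop_w / 2, 0), frame_w - crop_w))
    y0 = int(min(max(cy - crop_h / 2, 0), frame_h - crop_h))
    return x0, y0, x0 + crop_w, y0 + crop_h
```

Cropping every frame of a tracklet at the window returned for that frame's detection yields the stabilized, zoomed-in mini-scene.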
104
 
105
- See [project page](https://kabrdata.xyz/) and the [paper](https://openaccess.thecvf.com/content/WACV2024W/CV4Smalls/papers/Kholiavchenko_KABR_In-Situ_Dataset_for_Kenyan_Animal_Behavior_Recognition_From_Drone_WACVW_2024_paper.pdf) for data preprocessing details.
106
 
107
  We applied data augmentation techniques during training, including horizontal flipping to randomly
108
  mirror the input frames horizontally and color augmentations to randomly modify the
@@ -114,104 +115,38 @@ The model was trained for 120 epochs, using a batch size of 5.
114
  We used the EQL loss function to address the long-tailed class distribution and SGD optimizer with a learning rate of 1e5.
115
  We used a sample rate of 16x5, and random weight initialization.
116
 
117
- <!-- ADD RESULTS ONCE NEW PAPER PUBLISHED
118
-
119
- <!--
120
- #### Speeds, Sizes, Times
121
-
122
- <!-- [optional] This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
123
- <!--
124
- [More Information Needed]
125
 
126
  ## Evaluation
127
 
128
- <!-- This section describes the evaluation protocols and provides the results. -->
129
- <!--
130
- [More Information Needed]
131
-
132
- ### Testing Data, Factors & Metrics
133
-
134
- #### Testing Data
135
-
136
- <!-- This should link to a Dataset Card if possible, otherwise link to the original source with more info.
137
- Provide a basic overview of the test data and documentation related to any data pre-processing or additional filtering. -->
138
- <!--
139
- [More Information Needed]
140
 
141
- #### Factors
142
 
143
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
144
- <!--
145
- [More Information Needed]
146
 
147
  #### Metrics
148
 
149
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
150
- <!--
151
- [More Information Needed]
152
-
153
- ### Results
154
-
155
- [More Information Needed]
156
-
157
- #### Summary
158
-
159
- [More Information Needed]
160
 
161
- ## Model Examination
162
 
163
- <!-- [optional] Relevant interpretability work for the model goes here -->
164
- <!--
165
- [More Information Needed]
166
 
167
- ## Environmental Impact
168
-
169
- <!--
170
- It would be great to try to include this.
171
-
172
- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
173
- <!--
174
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute)
175
- presented in [Lacoste et al. (2019)](https://doi.org/10.48550/arXiv.1910.09700).
176
-
177
- - **Hardware Type:** [More Information Needed]
178
- - **Hours used:** [More Information Needed]
179
- - **Cloud Provider:** [More Information Needed]
180
- - **Compute Region:** [More Information Needed]
181
- - **Carbon Emitted:** [More Information Needed]
182
-
183
- ## Technical Specifications
184
- [More Information Needed--optional]
185
 
186
  ### Model Architecture and Objective
187
 
188
- [More Information Needed]
189
-
190
- ### Compute Infrastructure
191
-
192
- [More Information Needed]
193
 
194
  #### Hardware
195
 
196
- [More Information Needed: hardware requirements]
197
-
198
- #### Software
199
-
200
- [More Information Needed]
201
 
202
  ## Citation
203
 
204
- <!-- If there is a paper introducing the model, the Bibtex information for that should go in this section.
205
-
206
- See notes at top of file about selecting a license.
207
- If you choose CC0: This model is dedicated to the public domain for the benefit of scientific pursuits.
208
- We ask that you cite the model and journal paper using the below citations if you make use of it in your research.
209
-
210
- -->
211
-
212
  **BibTeX:**
213
 
214
-
215
  If you use our model in your work, please cite the model and associated paper.
216
 
217
  **Model**
@@ -257,13 +192,9 @@ Tanya Berger-Wolf, Daniel I. Rubenstein, Tilo Burghardt, Charles V. Stewart},
257
 
258
  ## Acknowledgements
259
 
260
- This work was supported by the [Imageomics Institute](https://imageomics.org),
261
- which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under
262
- [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240)
263
- (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning).
264
- Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s)
265
- and do not necessarily reflect the views of the National Science Foundation.
266
 
 
267
 
268
 
269
  ## Model Card Authors
@@ -272,11 +203,4 @@ Jenna Kline and Maksim Kholiavchenko
272
 
273
  ## Model Card Contact
274
 
275
- Maksim Kholiavchenko
276
- <!-- Could include who to contact with questions, but this is also what the "Discussions" tab is for. -->
277
-
278
- ### Contributions
279
-
280
- This work was supported by the [Imageomics Institute](https://imageomics.org), which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Additional support was also provided by the [AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE)](https://icicle.osu.edu/), which is funded by the US National Science Foundation under [Award #2112606](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2112606). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
281
-
282
- The data was gathered at the [Mpala Research Centre](https://mpala.org/) in Kenya, in accordance with Research License No. NACOSTI/P/22/18214. The data collection protocol adhered strictly to the guidelines set forth by the Institutional Animal Care and Use Committee under permission No. IACUC 1835F.
 
17
  - UAV
18
  - drone
19
  - video
20
+ model_description: "Behavior recognition model for in situ drone videos of zebras and giraffes, built using an X3D model initialized with Kinetics weights. It is trained on the KABR dataset, which comprises 10 hours of aerial video footage of reticulated giraffes (Giraffa reticulata), Plains zebras (Equus quagga), and Grevy’s zebras (Equus grevyi) captured using a DJI Mavic 2S drone. It includes both spatiotemporal (i.e., mini-scene) and behavior annotations provided by an expert behavioral ecologist."
21
  ---
22
 
23
  # Model Card for X3D-KABR-Kinetics
 
34
 
35
  ### Model Description
36
 
37
+ - **Developed by:** Maksim Kholiavchenko, Maksim Kukushkin, Otto Brookes, Jenna Kline, Sam Stevens, Isla Duporge, Alec Sheets,
38
  Reshma R. Babu, Namrata Banerji, Elizabeth Campolongo,
39
  Matthew Thompson, Nina Van Tiel, Jackson Miliko,
40
  Eduardo Bessa Mirmehdi, Thomas Schmid,
41
+ Tanya Berger-Wolf, Daniel I. Rubenstein, Tilo Burghardt, Charles V. Stewart
42
 
43
+ - **Model type:** X3D-L
44
+ - **License:** MIT
45
+ - **Fine-tuned from model:** [X3D-L, Kinetics](https://github.com/facebookresearch/SlowFast/blob/main/configs/Kinetics/X3D_L.yaml)
46
 
47
  This model was developed for the benefit of the community as an open-source product, thus we request that any derivative products are also open-source.
48
 
 
89
 
90
  ### Training Data
91
 
92
+ This model was trained on the [KABR mini-scene dataset](https://huggingface.co/datasets/imageomics/KABR).
93
 
94
  ### Training Procedure
95
 
 
103
  detection in a video frame.
104
  This allows us to compensate for the drone's movement and provides a stable, zoomed-in representation of the animal.
105
 
106
+ See the [KABR mini-scene project page](https://kabrdata.xyz/) and the [paper](https://openaccess.thecvf.com/content/WACV2024W/CV4Smalls/papers/Kholiavchenko_KABR_In-Situ_Dataset_for_Kenyan_Animal_Behavior_Recognition_From_Drone_WACVW_2024_paper.pdf) for data preprocessing details.
107
 
108
  We applied data augmentation techniques during training, including horizontal flipping to randomly
109
  mirror the input frames horizontally and color augmentations to randomly modify the
 
115
  We used the EQL loss function to address the long-tailed class distribution and SGD optimizer with a learning rate of 1e5.
116
  We used a sample rate of 16x5, and random weight initialization.
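The 16x5 sample rate above reads as 16 frames taken at a temporal stride of 5, a common convention in the SlowFast/X3D codebase. A minimal sketch of such a sampler, with hypothetical function and parameter names, might look like:

```python
def sample_clip_indices(num_frames, clip_len=16, stride=5):
    """Pick `clip_len` frame indices spaced `stride` apart.

    The clip start is centered in the video, and indices are clamped
    so that videos shorter than the sampled span repeat their last
    frame. Centering and clamping are illustrative assumptions.
    """
    span = (clip_len - 1) * stride + 1
    start = max((num_frames - span) // 2, 0)
    return [min(start + i * stride, num_frames - 1) for i in range(clip_len)]
```

For a 100-frame mini-scene this yields indices 12, 17, ..., 87; shorter clips pad by repeating the final frame.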
117
 
118
 
119
  ## Evaluation
120
 
121
+ The X3D-L model was evaluated on this dataset using the [SlowFast](https://github.com/facebookresearch/SlowFast) framework, specifically the [test_net script](https://github.com/facebookresearch/SlowFast/blob/main/tools/test_net.py).
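Evaluation in the SlowFast framework is driven by a YAML config. A minimal, illustrative TEST fragment is sketched below; the checkpoint and data paths are hypothetical, and this is not the exact config used for this model.

```yaml
TRAIN:
  ENABLE: False
TEST:
  ENABLE: True
  BATCH_SIZE: 16
  CHECKPOINT_FILE_PATH: ./checkpoints/x3d_kabr_kinetics.pyth  # hypothetical path
DATA:
  PATH_TO_DATA_DIR: ./kabr  # hypothetical path
NUM_GPUS: 1
```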
122
 
123
+ ### Testing Data
124
 
125
+ We provide a train-test split of the mini-scenes from the [KABR Dataset](https://huggingface.co/datasets/imageomics/KABR) for evaluation purposes (the test set is indicated in [annotations/val.csv](https://huggingface.co/datasets/imageomics/KABR/blob/main/KABR/annotation/val.csv)), with 75% of the data used for training and 25% for testing. No mini-scene was divided by the split, and the splits ensured a stratified representation of giraffes, Plains zebras, and Grevy’s zebras.
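A split with both properties (no mini-scene divided, species represented in each set) can be sketched in plain Python. This is an illustration of the splitting logic, not the project's actual split code; the function name and data layout are hypothetical.

```python
import random

def split_mini_scenes(scenes, test_frac=0.25, seed=0):
    """Split whole mini-scenes into train/test, stratified by species.

    `scenes` maps a mini-scene id to its species label (hypothetical
    layout). Splitting at the mini-scene level guarantees no
    mini-scene is divided across the two sets; sampling per species
    keeps each species represented in both splits.
    """
    rng = random.Random(seed)
    by_species = {}
    for scene_id, species in scenes.items():
        by_species.setdefault(species, []).append(scene_id)
    train, test = [], []
    for species, ids in sorted(by_species.items()):
        rng.shuffle(ids)
        n_test = max(1, round(len(ids) * test_frac))
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test
```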
126
 
127
  #### Metrics
128
 
129
+ We report precision, recall, and F1 score on the KABR mini-scene test set, along with the mean Average Precision (mAP) for overall, head-class, and tail-class performance.
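The per-class metrics above can be computed from predictions with a few lines of plain Python. This sketch assumes macro averaging over behavior classes (which weights head and tail classes equally, relevant for a long-tailed distribution); whether the card's numbers use macro or another averaging is not stated, so treat this as illustrative.

```python
from collections import Counter

def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over behavior classes.

    Each class contributes equally to the average, regardless of how
    many samples it has. Averaging choice is an assumption here.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp = Counter()
    pred_n = Counter(y_pred)
    true_n = Counter(y_true)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
    precs, recs, f1s = [], [], []
    for c in labels:
        p = tp[c] / pred_n[c] if pred_n[c] else 0.0
        r = tp[c] / true_n[c] if true_n[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precs.append(p)
        recs.append(r)
        f1s.append(f)
    n = len(labels)
    return sum(precs) / n, sum(recs) / n, sum(f1s) / n
```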
130
 
131
+ **Results**
132
 
133
+ | WI | BS | mAP Overall | mAP Head | mAP Tail | P | R | F1 |
134
+ |----------|----|-------------|----------|----------|--------|--------|--------|
135
+ | K-400 | 64 | **66.36** | **96.96**| **56.16**| 66.44 | 63.65 | 64.70 |
136
 
137
 
138
  ### Model Architecture and Objective
139
 
140
+ Please see the [Base Model Description](https://arxiv.org/pdf/2004.04730).
141
 
142
  #### Hardware
143
 
144
+ Running the X3D model requires a modern NVIDIA GPU with CUDA support. X3D-L is designed to be computationally efficient and requires 10–16 GB of GPU memory during training.
145
 
146
  ## Citation
147
 
148
  **BibTeX:**
149
 
150
  If you use our model in your work, please cite the model and associated paper.
151
 
152
  **Model**
 
192
 
193
  ## Acknowledgements
194
 
195
+ This work was supported by the [Imageomics Institute](https://imageomics.org), which is funded by the US National Science Foundation's Harnessing the Data Revolution (HDR) program under [Award #2118240](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2118240) (Imageomics: A New Frontier of Biological Information Powered by Knowledge-Guided Machine Learning). Additional support was also provided by the [AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE)](https://icicle.osu.edu/), which is funded by the US National Science Foundation under [Award #2112606](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2112606). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
196
 
197
+ The data was gathered at the [Mpala Research Centre](https://mpala.org/) in Kenya, in accordance with Research License No. NACOSTI/P/22/18214. The data collection protocol adhered strictly to the guidelines set forth by the Institutional Animal Care and Use Committee under permission No. IACUC 1835F.
198
 
199
 
200
  ## Model Card Authors
 
203
 
204
  ## Model Card Contact
205
 
206
+ For questions on this model, please open a [discussion](https://huggingface.co/imageomics/x3d-kabr-kinetics/discussions) on this repo.