Competition Overview
State Farm Distracted Driver Detection
Can computer vision spot distracted drivers?
https://www.kaggle.com/c/state-farm-distracted-driver-detection/overview
Description
We’ve all been there: a light turns green and the car in front of you doesn’t budge. Or, a previously unremarkable vehicle suddenly slows and starts swerving from side-to-side.
When you pass the offending driver, what do you expect to see? You certainly aren’t surprised when you spot a driver who is texting, seemingly enraptured by social media, or in a lively hand-held conversation on their phone.
According to the CDC motor vehicle safety division, one in five car accidents is caused by a distracted driver. Sadly, this translates to 425,000 people injured and 3,000 people killed by distracted driving every year.
State Farm hopes to improve these alarming statistics, and better insure their customers, by testing whether dashboard cameras can automatically detect drivers engaging in distracted behaviors. Given a dataset of 2D dashboard camera images, State Farm is challenging Kagglers to classify each driver’s behavior. Are they driving attentively, wearing their seatbelt, or taking a selfie with their friends in the backseat?
What to do
The 10 classes to predict are:
c0: normal driving
c1: texting - right
c2: talking on the phone - right
c3: texting - left
c4: talking on the phone - left
c5: operating the radio
c6: drinking
c7: reaching behind
c8: hair and makeup
c9: talking to passenger
Import Library
Loading the Image List
 | subject | classname | img |
---|---|---|---|
0 | p002 | c0 | img_44733.jpg |
1 | p002 | c0 | img_72999.jpg |
2 | p002 | c0 | img_25094.jpg |
3 | p002 | c0 | img_69092.jpg |
4 | p002 | c0 | img_92629.jpg |
... | ... | ... | ... |
22419 | p081 | c9 | img_56936.jpg |
22420 | p081 | c9 | img_46218.jpg |
22421 | p081 | c9 | img_25946.jpg |
22422 | p081 | c9 | img_67850.jpg |
22423 | p081 | c9 | img_9684.jpg |
22424 rows × 3 columns
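A minimal sketch of how a list like this could be loaded with pandas. It assumes the competition's `driver_imgs_list.csv` sits under the same `../input/state-farm-distracted-driver-detection/` directory used by the image paths in this post; `load_image_list` is a hypothetical helper name, and the demo reads a few rows copied from the table above so that it runs without the data.

```python
import io

import pandas as pd

# Assumed location of the list file; adjust to your environment.
LIST_PATH = "../input/state-farm-distracted-driver-detection/driver_imgs_list.csv"


def load_image_list(source=LIST_PATH):
    """Read the driver image list (columns: subject, classname, img)."""
    return pd.read_csv(source)


# Demo on rows taken from the table above, so the sketch runs without the data.
sample_csv = io.StringIO(
    "subject,classname,img\n"
    "p002,c0,img_44733.jpg\n"
    "p002,c0,img_72999.jpg\n"
    "p081,c9,img_56936.jpg\n"
)
driver_list = load_image_list(sample_csv)

# Number of images per driver; handy when thinking about validation splits.
images_per_subject = driver_list.groupby("subject")["img"].count()
```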
Building the Train Dataset
 | path | img | classname |
---|---|---|---|
0 | ../input/state-farm-distracted-driver-detection/imgs/train/c5/img_68208.jpg | img_68208.jpg | c5 |
1 | ../input/state-farm-distracted-driver-detection/imgs/train/c5/img_77583.jpg | img_77583.jpg | c5 |
2 | ../input/state-farm-distracted-driver-detection/imgs/train/c5/img_49189.jpg | img_49189.jpg | c5 |
3 | ../input/state-farm-distracted-driver-detection/imgs/train/c5/img_6690.jpg | img_6690.jpg | c5 |
4 | ../input/state-farm-distracted-driver-detection/imgs/train/c5/img_95740.jpg | img_95740.jpg | c5 |
... | ... | ... | ... |
22419 | ../input/state-farm-distracted-driver-detection/imgs/train/c0/img_6087.jpg | img_6087.jpg | c0 |
22420 | ../input/state-farm-distracted-driver-detection/imgs/train/c0/img_36959.jpg | img_36959.jpg | c0 |
22421 | ../input/state-farm-distracted-driver-detection/imgs/train/c0/img_19429.jpg | img_19429.jpg | c0 |
22422 | ../input/state-farm-distracted-driver-detection/imgs/train/c0/img_99342.jpg | img_99342.jpg | c0 |
22423 | ../input/state-farm-distracted-driver-detection/imgs/train/c0/img_48589.jpg | img_48589.jpg | c0 |
22424 rows × 3 columns
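One way to assemble a table with these three columns is to walk the class folders and record each file. The sketch below assumes the `imgs/train/<classname>/<img>` layout visible in the paths above; `build_train_frame` is a hypothetical helper, and the demo uses a throwaway directory instead of the Kaggle data. Rows are sorted by path here, so the order will not match the table above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Assumed root of the training images, taken from the paths in the table.
TRAIN_DIR = "../input/state-farm-distracted-driver-detection/imgs/train"


def build_train_frame(train_dir=TRAIN_DIR):
    """Collect every <classname>/<img>.jpg under train_dir into path/img/classname rows."""
    rows = [
        {"path": str(p), "img": p.name, "classname": p.parent.name}
        for p in sorted(Path(train_dir).glob("c*/*.jpg"))
    ]
    return pd.DataFrame(rows, columns=["path", "img", "classname"])


# Demo on a temporary directory so the sketch runs without the Kaggle data.
with tempfile.TemporaryDirectory() as tmp:
    for classname, img in [("c0", "img_6087.jpg"), ("c5", "img_68208.jpg")]:
        class_dir = Path(tmp) / classname
        class_dir.mkdir(exist_ok=True)
        (class_dir / img).touch()
    train_df = build_train_frame(tmp)
```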
Inspecting the Data
Building the Test Dataset
 | path | img |
---|---|---|
0 | ../input/state-farm-distracted-driver-detection/imgs/test/img_96590.jpg | img_96590.jpg |
1 | ../input/state-farm-distracted-driver-detection/imgs/test/img_32366.jpg | img_32366.jpg |
2 | ../input/state-farm-distracted-driver-detection/imgs/test/img_99675.jpg | img_99675.jpg |
3 | ../input/state-farm-distracted-driver-detection/imgs/test/img_85937.jpg | img_85937.jpg |
4 | ../input/state-farm-distracted-driver-detection/imgs/test/img_73903.jpg | img_73903.jpg |
... | ... | ... |
79721 | ../input/state-farm-distracted-driver-detection/imgs/test/img_109.jpg | img_109.jpg |
79722 | ../input/state-farm-distracted-driver-detection/imgs/test/img_53257.jpg | img_53257.jpg |
79723 | ../input/state-farm-distracted-driver-detection/imgs/test/img_90376.jpg | img_90376.jpg |
79724 | ../input/state-farm-distracted-driver-detection/imgs/test/img_28000.jpg | img_28000.jpg |
79725 | ../input/state-farm-distracted-driver-detection/imgs/test/img_93083.jpg | img_93083.jpg |
79726 rows × 2 columns
Training the Model with K-fold Cross Validation
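Training on the entire dataset for more stable model performance
- In this competition, training a model without cross validation tends to overfit to specific `subject` values (drivers).
- In other words, the model's `classname` predictions can be strongly influenced by `subject`.
- Even if `train_test_split` is used with the `stratify` option, it only samples part of the data, so the distribution of the test data can differ from that of the validation set.
- A hold-out split also gives up part of the training data, so cross validation is the way to go when you want to use the entire dataset for training.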
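- Characteristics of k-fold Cross Validation
  - All of the data can be used for both training and evaluation.
  - It gives an ensemble effect. It is easiest to think of each fold as training an independent model.
  - Because the same work is repeated several times, it takes longer.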
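The mechanics described above can be sketched in plain Python. This is an illustration under assumptions, not the training code used for this post: the network and framework are not shown here, so training is left to a caller-supplied `fit_predict` callable, and `kfold_splits`, `average_predictions`, and `cross_validate` are hypothetical helper names. Each fold trains its own model, and the per-fold test predictions are averaged, which is where the ensemble effect comes from.

```python
import random
from typing import Callable, List, Sequence, Tuple

Probs = List[List[float]]  # one row of class probabilities per test image


def kfold_splits(n_samples: int, n_splits: int = 5, seed: int = 42) -> List[Tuple[List[int], List[int]]]:
    """Shuffle row indices once, then use each fold in turn as the validation set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_splits] for i in range(n_splits)]
    splits = []
    for k, valid_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((train_idx, valid_idx))
    return splits


def average_predictions(fold_probs: Sequence[Probs]) -> Probs:
    """Average the test-set probabilities of the per-fold models (the ensemble step)."""
    n_folds = len(fold_probs)
    return [
        [sum(values) / n_folds for values in zip(*rows)]
        for rows in zip(*fold_probs)
    ]


def cross_validate(n_samples: int, fit_predict: Callable[[List[int], List[int]], Probs], n_splits: int = 5) -> Probs:
    """Train one model per fold through fit_predict and average their test predictions."""
    fold_probs = [
        fit_predict(train_idx, valid_idx)
        for train_idx, valid_idx in kfold_splits(n_samples, n_splits)
    ]
    return average_predictions(fold_probs)


def dummy_fit_predict(train_idx: List[int], valid_idx: List[int]) -> Probs:
    # Stand-in for real training: returns one 2-class probability row for one test image.
    share = len(valid_idx) / (len(train_idx) + len(valid_idx))
    return [[share, 1.0 - share]]


ensemble = cross_validate(n_samples=10, fit_predict=dummy_fit_predict, n_splits=5)
```

The split here is a plain shuffled split by row. Splitting by `subject` instead (for example with scikit-learn's `GroupKFold`) would be a variation aimed at the driver-overfitting concern; which split was used for the score reported in this post is not stated.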
Submission
- The image order in sample_submission.csv differs from the image order in result, so it is easier to build a new DataFrame.
 | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9 |
---|---|---|---|---|---|---|---|---|---|---|
0 | 4.898581e-03 | 6.054179e-03 | 5.146383e-03 | 5.325366e-04 | 7.062429e-04 | 0.012463 | 0.642532 | 2.142958e-02 | 3.042254e-01 | 0.002012 |
1 | 4.047657e-08 | 4.258116e-08 | 7.385115e-08 | 7.353977e-08 | 5.030099e-08 | 0.999987 | 0.000007 | 2.678771e-08 | 2.975033e-07 | 0.000006 |
2 | 4.653947e-01 | 9.502587e-03 | 3.015550e-02 | 7.141132e-03 | 8.581814e-03 | 0.001376 | 0.228859 | 1.078819e-03 | 2.433791e-01 | 0.004531 |
3 | 8.255903e-05 | 1.916165e-05 | 1.645608e-05 | 2.139845e-03 | 2.219971e-04 | 0.990963 | 0.000140 | 5.377778e-06 | 3.694886e-05 | 0.006375 |
4 | 3.502117e-07 | 1.061771e-04 | 4.572770e-05 | 2.403757e-06 | 1.187768e-05 | 0.000053 | 0.000408 | 9.991777e-01 | 1.797632e-04 | 0.000014 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
79721 | 7.563246e-03 | 1.897852e-02 | 2.062093e-01 | 1.308421e-03 | 5.482526e-03 | 0.002233 | 0.118231 | 1.006306e-01 | 5.321654e-01 | 0.007198 |
79722 | 2.018641e-03 | 3.363243e-04 | 4.372783e-04 | 4.747344e-01 | 4.001005e-01 | 0.004480 | 0.000733 | 3.581994e-04 | 8.031728e-04 | 0.115998 |
79723 | 8.465707e-02 | 5.784687e-01 | 3.608754e-03 | 3.662695e-02 | 1.371328e-02 | 0.011587 | 0.017968 | 7.717087e-02 | 3.644365e-02 | 0.139756 |
79724 | 6.226664e-08 | 4.983243e-08 | 7.004503e-09 | 2.172765e-08 | 6.979324e-08 | 0.999966 | 0.000004 | 6.643240e-09 | 2.172691e-06 | 0.000028 |
79725 | 5.966864e-02 | 2.801355e-02 | 7.123795e-03 | 6.038536e-04 | 2.516853e-03 | 0.004967 | 0.027987 | 1.176820e-03 | 8.555837e-01 | 0.012359 |
79726 rows × 10 columns
 | img | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | img_1.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
1 | img_10.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
2 | img_100.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
3 | img_1000.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
4 | img_100000.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
79721 | img_99994.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
79722 | img_99995.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
79723 | img_99996.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
79724 | img_99998.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
79725 | img_99999.jpg | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
79726 rows × 11 columns
 | img | c0 | c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8 | c9 |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | img_96590.jpg | 4.898581e-03 | 6.054179e-03 | 5.146383e-03 | 5.325366e-04 | 7.062429e-04 | 0.012463 | 0.642532 | 2.142958e-02 | 3.042254e-01 | 0.002012 |
1 | img_32366.jpg | 4.047657e-08 | 4.258116e-08 | 7.385115e-08 | 7.353977e-08 | 5.030099e-08 | 0.999987 | 0.000007 | 2.678771e-08 | 2.975033e-07 | 0.000006 |
2 | img_99675.jpg | 4.653947e-01 | 9.502587e-03 | 3.015550e-02 | 7.141132e-03 | 8.581814e-03 | 0.001376 | 0.228859 | 1.078819e-03 | 2.433791e-01 | 0.004531 |
3 | img_85937.jpg | 8.255903e-05 | 1.916165e-05 | 1.645608e-05 | 2.139845e-03 | 2.219971e-04 | 0.990963 | 0.000140 | 5.377778e-06 | 3.694886e-05 | 0.006375 |
4 | img_73903.jpg | 3.502117e-07 | 1.061771e-04 | 4.572770e-05 | 2.403757e-06 | 1.187768e-05 | 0.000053 | 0.000408 | 9.991777e-01 | 1.797632e-04 | 0.000014 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
79721 | img_109.jpg | 7.563246e-03 | 1.897852e-02 | 2.062093e-01 | 1.308421e-03 | 5.482526e-03 | 0.002233 | 0.118231 | 1.006306e-01 | 5.321654e-01 | 0.007198 |
79722 | img_53257.jpg | 2.018641e-03 | 3.363243e-04 | 4.372783e-04 | 4.747344e-01 | 4.001005e-01 | 0.004480 | 0.000733 | 3.581994e-04 | 8.031728e-04 | 0.115998 |
79723 | img_90376.jpg | 8.465707e-02 | 5.784687e-01 | 3.608754e-03 | 3.662695e-02 | 1.371328e-02 | 0.011587 | 0.017968 | 7.717087e-02 | 3.644365e-02 | 0.139756 |
79724 | img_28000.jpg | 6.226664e-08 | 4.983243e-08 | 7.004503e-09 | 2.172765e-08 | 6.979324e-08 | 0.999966 | 0.000004 | 6.643240e-09 | 2.172691e-06 | 0.000028 |
79725 | img_93083.jpg | 5.966864e-02 | 2.801355e-02 | 7.123795e-03 | 6.038536e-04 | 2.516853e-03 | 0.004967 | 0.027987 | 1.176820e-03 | 8.555837e-01 | 0.012359 |
79726 rows × 11 columns
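A table like the one above can be put together by attaching the test image names to the prediction rows in the same order they were predicted, rather than reusing the order of sample_submission.csv. The sketch below is one possible way to do that with pandas; `make_submission` is a hypothetical helper, the demo values are placeholders rather than model output, and the output file name in the last comment is an assumption.

```python
import pandas as pd

CLASS_COLUMNS = [f"c{i}" for i in range(10)]  # c0 ... c9, as in the tables above


def make_submission(img_names, result):
    """Attach each test image name to its own row of class probabilities."""
    submission = pd.DataFrame(result, columns=CLASS_COLUMNS)
    submission.insert(0, "img", list(img_names))
    return submission


# Demo with two placeholder rows; real values would come from the model.
img_names = ["img_96590.jpg", "img_32366.jpg"]
result = [[0.1] * 10, [0.1] * 10]
submission = make_submission(img_names, result)

# submission.to_csv("submission.csv", index=False)  # file name is an assumption
```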
- The final score, evaluated with multi-class logarithmic loss, is 0.18165.
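For reference, multi-class logarithmic loss is the mean negative log of the probability assigned to the true class. The sketch below is a plain-Python illustration, not Kaggle's scoring code: the row rescaling and the clipping value `eps=1e-15` follow my understanding of the competition's evaluation rules and should be treated as assumptions. It shows that a uniform 0.1 submission, like the sample submission above, scores ln(10) ≈ 2.3026.

```python
import math


def multiclass_log_loss(true_labels, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class of each row."""
    total = 0.0
    for label, row in zip(true_labels, probs):
        p = row[label] / sum(row)        # rescale so the row sums to 1
        p = max(min(p, 1 - eps), eps)    # clip to avoid log(0)
        total += math.log(p)
    return -total / len(true_labels)


# Uniform predictions over 10 classes, whatever the true labels are.
uniform = [[0.1] * 10 for _ in range(3)]
baseline = multiclass_log_loss([0, 5, 9], uniform)  # ln(10), about 2.3026
```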