An undergraduate course on data mining.
This project is maintained by chatox
You will need:
import nltk
and nltk.download('punkt')
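As a rough sketch, a setup check along the following lines may help confirm that the requirements above are in place before a session. The helper names `is_installed` and `ensure_punkt` are hypothetical and not part of the course material. The download step assumes a working internet connection.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def ensure_punkt() -> bool:
    """Fetch the 'punkt' tokenizer data if NLTK is available.

    Returns False when NLTK is not installed or the download fails.
    Hypothetical helper; needs network access the first time it runs.
    """
    if not is_installed("nltk"):
        return False
    import nltk

    # nltk.download returns True on success; quiet=True hides progress output.
    return bool(nltk.download("punkt", quiet=True))
```

Calling `ensure_punkt()` once in a notebook cell should be enough. If it returns `False`, NLTK is probably missing from the active Python environment or the download did not complete.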
If you run into problems installing this software, please ask in the course forum. Do not ask the practice instructors; they do not have the bandwidth for this.
Practice sessions are carried out on a computer.
There are 9 practice handouts in this course (PS01 to PS09), provided as Python notebooks. Download the notebooks, open them, and follow the instructions there. Each handout's name starts with psNN, and the handout describes the activities that students must perform during the practice session.
Read the practice descriptions before the session, as they can sometimes be long. You can start working on them at any point, but they are not final until the end of the session; details may change.
Some parts are not visible in the preview shown on the GitHub website, so you need to download the notebook to see the instructions.
At the end of each handout there is a description of what you should deliver. If you have any questions, please ask in the course forum or ask your practice instructor (“profesor/a de prácticas”).
| Session | Handouts | Contents | Deadline 101 | Deadline 102 | Deadline 103 |
|---|---|---|---|---|---|
| 1 | PS01+PS02 | Data preparation (two sessions, grade x 2) | 24 h after session 2 | 24 h after session 2 | 24 h after session 2 |
| 2 | PS01+PS02 | Wrap-up | - | - | - |
| 3 | PS03 | Near-duplicate detection | 24 h after session 5 | 24 h after session 5 | 24 h after session 5 |
| 4 | PS04 | Association rule mining | 24 h after session 5 | 24 h after session 5 | 24 h after session 5 |
| 5 | PS03+PS04 | Wrap-up | - | - | - |
| 6 | PS05 | Content-based recommendations | 24 h after session 8 | 24 h after session 8 | 24 h after session 8 |
| 7 | PS06 | Item-based similarity recommendations | 24 h after session 8 | 24 h after session 8 | 24 h after session 8 |
| 8 | PS05+PS06 | Wrap-up | - | - | - |
| 9 | PS07 | Outlier analysis | 24 h after session 12 | 24 h after session 12 | 24 h after session 12 |
| 10 | PS08 | Data streams | 24 h after session 12 | 24 h after session 12 | 24 h after session 12 |
| 11 | PS09 | Time series forecasting | 24 h after session 12 | 24 h after session 12 | 24 h after session 12 |
| 12 | PS07+PS08+PS09 | Wrap-up | - | - | - |