Restaurant parser improvements #40

Open
opened 2026-02-04 14:29:19 +01:00 by batmanisko · 0 comments
Member

Feature: Restaurant Parser Improvements

Source: TODO.md lines 19-29

Implementation Notes

Improve the reliability and accuracy of restaurant menu scrapers/parsers.

Sub-tasks from TODO.md:

Sladovnická:

  • The initial index validation is unnecessary: the date of each specific day is also present in the meal table itself (see the TODO in the parser)

U Motlíků:

  • Validate that the input date is included in the range shown above the table (e.g., '12.6.-16.6.')
  • Menu is fetched once per day, but theoretically once per week would suffice (assuming it doesn't change mid-week)

TechTower:

  • Validate that the input date is included in the range shown above the table (typically 'Obědy 12. 6. - 16. 6. 2023 (každý den vždy i obědový bufet)')
  • The menu is fetched on the first request of the day, but the website is often updated only on Monday morning, so the displayed menu may be outdated
    • The page doesn't send a last-modified header, so that can't be relied upon
    • No clear solution other than more frequent scraping of the entire page
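
The date range check described for U Motlíků and TechTower could look roughly like the sketch below. It is written in Python purely for illustration and assumes the range header has already been extracted as a string. The names `parse_date_range` and `is_date_in_range` are hypothetical and do not come from the existing parsers. When the header carries no year (as in '12.6.-16.6.'), the year of the requested date is assumed.

```python
import re
from datetime import date

# Matches ranges such as '12.6.-16.6.' or '12. 6. - 16. 6. 2023'.
# The year is optional on both sides of the dash.
_RANGE_RE = re.compile(
    r"(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})?\s*[-–]\s*"
    r"(\d{1,2})\.\s*(\d{1,2})\.\s*(\d{4})?"
)


def parse_date_range(text, reference):
    """Return (start, end) parsed from text, or None if no valid range is found.

    reference is the requested date; it supplies the year when the header
    omits it (an assumption, the page itself does not say which year applies).
    """
    match = _RANGE_RE.search(text)
    if match is None:
        return None
    d1, m1, y1, d2, m2, y2 = match.groups()
    end_year = int(y2) if y2 else reference.year
    start_year = int(y1) if y1 else end_year
    try:
        start = date(start_year, int(m1), int(d1))
        end = date(end_year, int(m2), int(d2))
    except ValueError:
        return None  # e.g. '31.2.' is not a real date
    if start > end:
        # The range crosses New Year, e.g. '30.12.-3.1.'.
        if reference >= start:
            end = end.replace(year=end.year + 1)
        else:
            start = start.replace(year=start.year - 1)
    return start, end


def is_date_in_range(day, text):
    """True if day lies inside the range printed in text (inclusive)."""
    parsed = parse_date_range(text, day)
    if parsed is None:
        return False
    start, end = parsed
    return start <= day <= end
```

A parser could call `is_date_in_range(requested_day, header_text)` before reading the table and report a stale or unexpected menu when it returns `False`. Treating an unparseable header as "not in range" is a design choice of this sketch; the real parsers may prefer to raise an error instead.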

Approach:

  1. Review each parser for the issues listed above
  2. Add date range validation to U Motlíků and TechTower parsers
  3. Refactor Sladovnická parser to use date from the meal table directly
  4. Consider a smarter caching/refresh strategy for TechTower (e.g., re-scrape every few hours on Monday mornings)
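
One possible shape for the refresh strategy in step 4 is sketched below in Python. It keeps the current once-per-day behaviour but treats a copy fetched on Monday morning as short-lived. The function name `should_refetch`, the noon cutoff, and the two-hour interval are all assumptions chosen for illustration; nothing in the issue fixes those values.

```python
from datetime import datetime, timedelta

MONDAY = 0  # datetime.weekday() value for Monday
MONDAY_REFRESH_UNTIL_HOUR = 12  # assumed cutoff: updates are expected before noon
MONDAY_REFRESH_INTERVAL = timedelta(hours=2)  # assumed re-scrape interval


def should_refetch(fetched_at, now):
    """Decide whether the cached menu should be scraped again.

    fetched_at is when the cached copy was downloaded (None if never).
    """
    if fetched_at is None:
        return True
    if fetched_at.date() != now.date():
        # Existing behaviour: the first request of a new day refetches.
        return True
    if now.weekday() == MONDAY and fetched_at.hour < MONDAY_REFRESH_UNTIL_HOUR:
        # A copy taken on Monday morning may predate the weekly update,
        # so it expires after the refresh interval.
        return now - fetched_at >= MONDAY_REFRESH_INTERVAL
    return False
```

Because the check looks at when the cached copy was taken, a copy fetched late on Monday morning is still refreshed once in the afternoon, after which the cache holds for the rest of the day.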
batmanisko added the Vylepšení label 2026-02-04 14:30:18 +01:00
Reference: Marbes/Luncher#40