Arik Jones (aider) | f1f0bd3895 | feat: add MaxDepth to URL-based configuration and use outputType directly | 2024-09-19 16:36:24 -05:00 |
Arik Jones (aider) | 57bbc5a1ac | refactor: update writeMultipleFiles to create single output file | 2024-09-19 16:35:46 -05:00 |
Arik Jones (aider) | 32499abbc0 | fix: improve URL parsing and title extraction in getFilenameFromContent | 2024-09-19 16:33:55 -05:00 |
Arik Jones (aider) | 237ed512fc | fix: handle error from getFilenameFromContent in writeMultipleFiles | 2024-09-19 16:32:30 -05:00 |
Arik Jones (aider) | 7c8fcc3261 | fix: update getFilenameFromContent to handle invalid URLs and use .rollup.md suffix | 2024-09-19 16:31:43 -05:00 |
Arik Jones (aider) | 30e11153f9 | refactor: update getFilenameFromContent to remove http from filenames | 2024-09-19 16:29:55 -05:00 |
Arik Jones (aider) | 0219881f61 | fix: remove unused import in cmd/web_test.go | 2024-09-19 16:26:38 -05:00 |
Arik Jones (aider) | c77ae918c5 | refactor: remove redundant variable declarations in test file | 2024-09-19 16:25:30 -05:00 |
Arik Jones (aider) | 1b696ce9c6 | refactor: use wrapper functions for easier testing | 2024-09-19 16:25:02 -05:00 |
Arik Jones (aider) | df1178cb03 | test: refactor TestScrapeURL to use local mock functions | 2024-09-19 16:24:23 -05:00 |
Arik Jones (aider) | c4831dfea2 | fix: resolve compilation errors in web_test.go | 2024-09-19 16:23:40 -05:00 |
Arik Jones (aider) | 3c22d8034d | fix: correct import path and update Config struct usage in test | 2024-09-19 16:23:01 -05:00 |
Arik Jones (aider) | c7791814c9 | fix: add missing imports and correct Config reference in files_test.go | 2024-09-19 16:21:11 -05:00 |
Arik Jones (aider) | e184cef444 | test: add unit tests for cmd and internal packages | 2024-09-19 16:15:32 -05:00 |
Arik Jones (aider) | 702665bb2e | fix: import config package to resolve undefined error | 2024-09-19 16:12:30 -05:00 |
Arik Jones (aider) | 1d02cab585 | fix: resolve type mismatch for PathOverrides in SiteConfig | 2024-09-19 16:11:14 -05:00 |
Arik Jones (aider) | e3fddf101c | fix: resolve undefined types and import issues in scraper.go | 2024-09-19 16:10:06 -05:00 |
Arik Jones (aider) | 569ff9924d | feat: implement site-based scraping with path overrides | 2024-09-19 16:06:55 -05:00 |
Arik Jones (aider) | 1d38e4157c | fix: add Scrape field to Config struct and create ScrapeConfig | 2024-09-19 15:23:35 -05:00 |
Arik Jones (aider) | d44fabf783 | feat: implement rate limiting for URL scraping | 2024-09-19 15:22:02 -05:00 |
Arik Jones | f9eee282bc | docs: Update readme to include generate command that produces default config file. | 2024-09-19 12:22:42 -05:00 |
Arik Jones (aider) | fca1422104 | refactor: improve generate command and use config package | 2024-09-19 12:08:34 -05:00 |
Arik Jones (aider) | 2e563836f3 | feat: add generate subcommand for creating rollup.yml config | 2024-09-19 12:08:09 -05:00 |
Arik Jones | 160a15dbb1 | fix: Use logger instead of log. Move web subcommand initialization to root.go | 2024-09-19 11:44:27 -05:00 |
Arik Jones (aider) | eabf1ba23f | feat: add files subcommand and refactor rollup functionality | 2024-09-19 11:38:09 -05:00 |
Arik Jones | 1e88fae75d | docs: Update the readme | 2024-09-19 11:08:13 -05:00 |
Arik Jones | eba453f09e | fix: rollup output file name (again) | 2024-09-19 11:02:35 -05:00 |
Arik Jones | d3ba28d03b | fix: Output markdown files should end in *.rollup.md | 2024-09-19 10:56:00 -05:00 |
Arik Jones | 197f3affc7 | fix: Don't use PersistentPreRunE. It caused the actual runRollup function to never run. | 2024-09-19 10:43:23 -05:00 |
Arik Jones (aider) | 7f468a05bd | feat: install only Chromium browser | 2024-09-17 14:51:09 -05:00 |
Arik Jones (aider) | 4586b5daaa | fix: Install Playwright and browsers before initializing | 2024-09-17 14:48:15 -05:00 |
Arik Jones (aider) | 056c3e368e | fix: Update import and usage of Config type in cmd/root.go | 2024-09-16 09:53:44 -05:00 |
Arik Jones (aider) | 21d3e8ee68 | fix: Handle missing configuration file for help command | 2024-09-16 09:52:48 -05:00 |
Arik Jones (aider) | efee186ae0 | fix: Skip config loading and rollup execution for help command | 2024-09-16 09:52:25 -05:00 |
Arik Jones | 41fb9e3fad | Correction in web scraping example. | 2024-09-14 21:38:17 -05:00 |
Arik Jones (aider) | 6cb2f03d74 | feat: Add web scraping functionality and exclusionary CSS paths | 2024-09-14 21:26:59 -05:00 |
Arik Jones | bb12e3d029 | fix: Something in root | 2024-09-14 21:25:50 -05:00 |
Arik Jones (aider) | 53dcd6eb71 | feat: Add support for exclusionary CSS paths in config.go | 2024-09-14 20:59:08 -05:00 |
Arik Jones (aider) | ece9492b30 | fix: Remove unused import in cmd/web.go | 2024-09-14 20:56:51 -05:00 |
Arik Jones (aider) | c1755836b5 | fix: Move HTML to Markdown conversion to scraper.go | 2024-09-14 20:55:35 -05:00 |
Arik Jones | 939cffb55e | fix: Simplify sanitizeFilename function | 2024-09-14 20:55:34 -05:00 |
Arik Jones (aider) | b6de9d211b | fix: Merge duplicate runWeb function and add missing function definitions | 2024-09-14 20:42:10 -05:00 |
Arik Jones (aider) | a6ebf0062a | fix: Add --verbose flag to web subcommand | 2024-09-14 20:41:23 -05:00 |
Arik Jones (aider) | aaff602b3e | fix: Use local getFilenameFromContent function instead of undefined scraper.GetFilenameFromContent | 2024-09-14 20:38:06 -05:00 |
Arik Jones (aider) | 6f4750c900 | fix: Remove references to non-existent CSSLocator field in Config struct | 2024-09-14 20:36:31 -05:00 |
Arik Jones (aider) | 52c7de255d | feat: Implement scraping of multiple URLs with optional CSS locators and separate output files | 2024-09-14 20:35:35 -05:00 |
Arik Jones (aider) | 5264023cba | feat: add MIT license | 2024-09-14 20:15:05 -05:00 |
Arik Jones (aider) | 87c2a81375 | feat: Add README.md | 2024-09-14 20:13:00 -05:00 |
Arik Jones (aider) | b1db362a94 | fix: Initialize logger before calling InitPlaywright | 2024-09-14 19:59:39 -05:00 |
Arik Jones (aider) | 23508df6f4 | feat: Add optional logging to the scraper | 2024-09-14 19:59:02 -05:00 |