rollup

mirror of https://github.com/tnypxl/rollup.git synced 2025-12-15 15:03:17 +00:00

Author	SHA1	Message	Date
Arik Jones (aider)	d5a94f5468	fix: remove indentation while preserving HTML structure in ExtractContentWithCSS	2024-09-22 17:00:16 -05:00
Arik Jones (aider)	59994c085c	fix: improve file ignore logic and preserve newlines in extracted content	2024-09-22 16:58:53 -05:00
Arik Jones (aider)	364b185269	fix: resolve test failures in TestRunRollup, TestExtractContentWithCSS, and TestExtractLinks	2024-09-21 16:04:20 -05:00
Arik Jones (aider)	952c2dda02	refactor: update browser initialization in scraper tests	2024-09-21 16:01:51 -05:00
Arik Jones	73116e8d82	Fix logging and other issues from preventing scraping	2024-09-21 15:54:33 -05:00
Arik Jones	160a15dbb1	fix: Use logger instead of log. Move web subcommand initialization to root.go	2024-09-19 11:44:27 -05:00
Arik Jones (aider)	7f468a05bd	feat: install only Chromium browser	2024-09-17 14:51:09 -05:00
Arik Jones (aider)	4586b5daaa	fix: Install Playwright and browsers before initializing	2024-09-17 14:48:15 -05:00
Arik Jones (aider)	53dcd6eb71	feat: Add support for exclusionary CSS paths in config.go	2024-09-14 20:59:08 -05:00
Arik Jones (aider)	c1755836b5	fix: Move HTML to Markdown conversion to scraper.go	2024-09-14 20:55:35 -05:00
Arik Jones (aider)	6f4750c900	fix: Remove references to non-existent CSSLocator field in Config struct	2024-09-14 20:36:31 -05:00
Arik Jones (aider)	52c7de255d	feat: Implement scraping of multiple URLs with optional CSS locators and separate output files	2024-09-14 20:35:35 -05:00
Arik Jones (aider)	23508df6f4	feat: Add optional logging to the scraper	2024-09-14 19:59:02 -05:00
Arik Jones	01d6b2f54f	fix: Improve page content extraction in scraper	2024-09-14 19:59:01 -05:00
Arik Jones (aider)	3378402fb9	fix: Handle missing content in ProcessHTMLContent	2024-09-14 19:43:58 -05:00
Arik Jones	2ab0d74279	fix: Update scraper to handle empty URLs	2024-09-14 19:42:38 -05:00
Arik Jones (aider)	eaa7135eab	feat: Improve content extraction with fallback to body	2024-09-14 17:05:05 -05:00
Arik Jones (aider)	7cdd68d020	feat: Separate include and exclude selectors in web scraper	2024-09-14 16:59:59 -05:00
Arik Jones (aider)	39e06ee9d5	fix: remove space between minus and CSS path in parseSelectors	2024-09-14 16:54:34 -05:00
Arik Jones (aider)	d66fd04016	fix: Use `-` instead of `!` to filter unwanted elements	2024-09-14 16:53:42 -05:00
Arik Jones (aider)	56d5a8a194	refactor: Remove XPath support	2024-09-14 16:51:18 -05:00
Arik Jones (aider)	09f8ed07c2	fix: Remove unused variable `excludeXPaths` in `ExtractContentWithXPath` function	2024-09-14 16:50:34 -05:00
Arik Jones (aider)	f1af20e95e	feat: Add support for excluding child elements in content extraction	2024-09-14 16:49:32 -05:00
Arik Jones (aider)	d0ee666b07	refactor: Modify scraper to capture only the main content	2024-09-14 15:20:15 -05:00
Arik Jones (aider)	1a57be80fa	fix: Remove print media emulation and improve CSS selector extraction	2024-09-14 15:14:53 -05:00
Arik Jones (aider)	ea12ad631c	fix: Fix assignment mismatch in ExtractContentWithCSS function	2024-09-14 14:54:04 -05:00
Arik Jones (aider)	885f3fc2b8	feat: Add missing scraper functions	2024-09-14 14:52:45 -05:00
Arik Jones	0163c4e504	Adds a configuration layer for use rollup.yml which may be preferred over CLI flags.	2024-09-05 23:41:39 -05:00

28 Commits