Searching HTML, CSS, and JS for links

Hello,

after finding out, to my considerable frustration, that httrack is unusable (a bug in the proxy), I am currently looking for Linux software or a library that searches HTML, CSS, and JS files for onward links.

HTML is fairly easy.
JS and CSS less so, especially when the URLs are partially masked.
And HTML sometimes contains inline CSS and JS as well.
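To make the HTML case concrete, here is a minimal sketch using only Python's standard library. It collects URL-bearing attributes and sets aside inline CSS and JS so they can be scanned separately. The names `LinkCollector`, `collect`, and `URL_ATTRS` are made up for illustration, and the attribute list is an assumption that is certainly incomplete.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Attributes that commonly carry a URL (illustrative, incomplete set).
URL_ATTRS = {"href", "src", "action", "poster", "data"}


class LinkCollector(HTMLParser):
    """Collect URL attributes plus inline CSS and JS for later scanning."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []        # absolute URLs found in attributes
        self.inline_css = []   # style="" values and <style> bodies
        self.inline_js = []    # non-empty <script> bodies
        self._current = None   # "style" or "script" while inside such a tag

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name in URL_ATTRS:
                # Resolve relative references against the page URL.
                self.links.append(urljoin(self.base_url, value.strip()))
            elif name == "style":
                self.inline_css.append(value)
        if tag in ("style", "script"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "style":
            self.inline_css.append(data)
        elif self._current == "script" and data.strip():
            self.inline_js.append(data)


def collect(html, base_url):
    """Parse one HTML document and return the filled collector."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser
```

The collected `inline_css` and `inline_js` strings still need their own scan, which is where the harder part of the problem lies.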

Does anyone have suggestions, or even hands-on experience?

Thanks

Stefan

Here is a negative example from a CSS file:

background: url ..\/images\/tile.jpg;
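A line like this can still be recovered with a deliberately lenient pattern followed by an unmasking step. The sketch below assumes the masking is limited to backslash-escaped slashes (`\/`), as in the example. The names `LENIENT_URL`, `unmask`, and `css_links` and the base URL in the usage note are hypothetical.

```python
import re
from urllib.parse import urljoin

# "url" followed by an optionally parenthesised, optionally quoted value.
# Lenient on purpose: parentheses and quotes may be missing entirely.
# Limitation: quoted values containing spaces are cut at the first space.
LENIENT_URL = re.compile(
    r"""\burl\s*\(?\s*["']?([^"'()\s;]+)["']?\s*\)?""", re.IGNORECASE
)


def unmask(raw):
    """Undo backslash-escaped slashes (the only masking assumed here)."""
    return raw.replace("\\/", "/")


def css_links(css_text, base_url):
    """Return absolute URLs for every url-like reference in css_text."""
    return [
        urljoin(base_url, unmask(match.group(1)))
        for match in LENIENT_URL.finditer(css_text)
    ]
```

Called as `css_links(r"background: url ..\/images\/tile.jpg;", "https://example.com/css/site.css")`, this resolves the reference relative to the stylesheet. Being this lenient will also produce false positives on other input, so the results are best treated as candidates to verify.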

Content-Key: 565856

Url: https://administrator.de/contentid/565856

3 Comments

Hello,
regex is your friend.

Hello Stefan,

take a look at scrapy.
It is based on Python and lets you write your own crawler configs.
It handles HTML, CSS, JS, etc.

Maybe that helps.

Regards

Quote from @godlie:
Regex is your friend

1. Regex is definitely not my friend, and I have genuinely tried.

2. A quick look tells me that the effort with regex is considerable. There are probably 20-30 official syntaxes in which URLs are used, and each of them probably has another 10 variants or incorrect but widespread spellings.
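One way to keep that effort manageable is a table of patterns, so that each syntax or variant becomes one more row rather than a rewrite. The following is only a sketch under that assumption. `PATTERNS` and `find_links` are illustrative names, and the four patterns cover a tiny fraction of the syntaxes mentioned above.

```python
import re

# One list of patterns per file kind. Every pattern exposes the URL in
# exactly one capturing group that actually matched. Illustrative only.
PATTERNS = {
    "css": [
        # url(x), url("x"), url('x'), with optional inner whitespace
        re.compile(r"""\burl\(\s*(?:"([^"]+)"|'([^']+)'|([^)\s"']+))\s*\)""", re.I),
        # @import "x.css" or @import 'x.css'
        re.compile(r"""@import\s+(?:"([^"]+)"|'([^']+)')""", re.I),
    ],
    "js": [
        # string literals starting with http://, https:// or //
        re.compile(r"""["']((?:https?:)?//[^"'\s]+)["']"""),
        # string literals ending in a few well-known file extensions
        re.compile(r"""["']([^"'\s]+\.(?:js|css|html?|png|jpe?g|gif|svg))["']""", re.I),
    ],
}


def find_links(text, kind):
    """Return unique URL candidates found in text, in order of discovery."""
    seen = set()
    result = []
    for pattern in PATTERNS[kind]:
        for match in pattern.finditer(text):
            # Exactly one alternative group is set per match.
            value = next(g for g in match.groups() if g is not None)
            if value not in seen:
                seen.add(value)
                result.append(value)
    return result
```

Results are grouped per pattern rather than by position in the file, and anything built at runtime by string concatenation in JS will be missed entirely.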
