jsoup - Java HTML parser

Distribution: RPM Universal
Repository: JPackage 6.0 all
Package name: jsoup
Package version: 1.6.2
Package release: 1.jpp6
Package architecture: noarch
Package type: rpm
Installed size: 291.93 KB
Download size: 266.91 KB
Official Mirror: mirrors.dotsrc.org
Jsoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. Jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.

  • scrape and parse HTML from a URL, file, or string
  • find and extract data, using DOM traversal or CSS selectors
  • manipulate the HTML elements, attributes, and text
  • clean user-submitted content against a safe white-list, to prevent XSS attacks
  • output tidy HTML

Jsoup is designed to deal with all varieties of HTML found in the wild: from pristine and validating to invalid tag soup, jsoup will create a sensible parse tree.



  Provides

  • jsoup = 1.6.2-1.jpp6


    Install Howto

    Fedora, CentOS, RHEL:
    1. Download the latest jpackage-release rpm from
    2. Install jpackage-release rpm:
      # rpm -Uvh jpackage-release*rpm
    3. Install jsoup rpm package:
      # yum install jsoup
    openSUSE:
    1. Add the JPackage 6.0 repository:
      # zypper addrepo http://mirrors.dotsrc.org/jpackage/6.0/generic/free/ jpackage-6.0
    2. Install jsoup rpm package:
      # zypper install jsoup
    Mandriva, Mageia:
    1. Add the JPackage 6.0 repository:
      # urpmi.addmedia jpackage-6.0 http://mirrors.dotsrc.org/jpackage/6.0/generic/free/ with hdlist.cz
    2. Update packages list:
      # urpmi.update -a
    3. Install jsoup rpm package:
      # urpmi jsoup
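
    Whichever distribution was used, it can be worth confirming that the installed package matches the version listed on this page. The following is only a sketch: the helper name jsoup_version_ok is hypothetical, and the script assumes the rpm command is available, as it is on the RPM-based systems above. It prints nothing when rpm or the package is absent.

```shell
#!/bin/sh
# Expected version-release string, taken from the package details on this page.
EXPECTED="1.6.2-1.jpp6"

# Hypothetical helper: succeeds when the given version-release string
# matches the expected one.
jsoup_version_ok() {
    [ "$1" = "$EXPECTED" ]
}

# Query the RPM database only when rpm exists and jsoup is installed.
if command -v rpm >/dev/null 2>&1 && rpm -q jsoup >/dev/null 2>&1; then
    installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' jsoup)
    if jsoup_version_ok "$installed"; then
        echo "jsoup $installed is installed"
    else
        echo "jsoup is installed as $installed (expected $EXPECTED)"
    fi
fi
```

    A mismatch is not necessarily an error: the repository may simply carry a newer build than the one described here.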


    Files

    • /etc/maven/fragments/jsoup
    • /usr/share/java/jsoup-1.6.2.jar
    • /usr/share/java/jsoup.jar
    • /usr/share/maven2/poms/JPP-jsoup.pom
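
    The paths listed above can be checked after installation with a short script. This is a minimal sketch, not part of the package: check_jsoup_files and its optional ROOT argument (a path prefix such as a chroot directory) are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Files that this package is listed as installing.
JSOUP_FILES="/etc/maven/fragments/jsoup
/usr/share/java/jsoup-1.6.2.jar
/usr/share/java/jsoup.jar
/usr/share/maven2/poms/JPP-jsoup.pom"

# check_jsoup_files [ROOT]
# Hypothetical helper: prints every listed path that does not exist under
# ROOT (empty ROOT means the live system) and returns non-zero if any is missing.
check_jsoup_files() {
    root="${1:-}"
    missing=0
    # The paths contain no spaces, so plain word splitting is safe here.
    for f in $JSOUP_FILES; do
        if [ ! -e "$root$f" ]; then
            echo "missing: $root$f"
            missing=1
        fi
    done
    return "$missing"
}
```

    Calling check_jsoup_files with no argument checks the running system; it prints nothing and returns 0 when all four paths are present.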


    Changelog

    2012-06-05 - Ralph Apel <r.apel@r-apel.de> 1.6.2-1 - 1.6.2