Buran 🌀

Buran (meaning "Snowstorm" or "Blizzard") was the first spaceplane to be produced as part of the Soviet/Russian Buran programme. (Source: Wikipedia)

Parse and generate RSS/Atom feeds in Clojure

Buran is a bidirectional feed library: parse any RSS/Atom feed into Clojure data structures, transform them with standard functions, and produce feeds in any format. Built on ROME Tools with a data-driven approach.

Buran can act as an aggregator for feeds in various formats by converting them into regular Clojure data structures. Consuming a feed yields a map that you can read or manipulate with regular functions such as filter, sort, assoc, and dissoc. After modifying the map, you can have Buran generate your own feed from it, optionally in a different format (RSS 2.0, 1.0, 0.9x or Atom 1.0, 0.3).
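
Because the consumed feed is an ordinary map, it can be reshaped without any Buran-specific calls. The following is a minimal sketch using only clojure.core and clojure.string. The `feed` map is hand-written in the `:info`/`:entries` shape shown in this README, and `clojure-only` is a hypothetical helper name, not part of Buran's API.

```clojure
(require '[clojure.string :as str])

;; Hand-written stand-in for the map a consume call would return.
(def feed
  {:info    {:feed-type "atom_1.0" :title "Feed title"}
   :entries [{:title "Zebra news"   :description {:value "about zebras"}}
             {:title "Clojure tips" :description {:value "about Clojure"}}
             {:title "Clojure news" :description {:value "more Clojure"}}]})

(defn clojure-only
  "Keeps entries whose title mentions Clojure, sorts them by title,
  and changes the target feed type. Hypothetical helper for illustration."
  [feed]
  (-> feed
      (update :entries (fn [entries]
                         (filter #(str/includes? (:title %) "Clojure") entries)))
      (update :entries #(vec (sort-by :title %)))
      (assoc-in [:info :feed-type] "rss_2.0")))

(clojure-only feed)
```

The result is still a plain map, so it could be handed to produce afterwards. Depending on the target format, additional fields in `:info` may be required before the feed can be generated.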

Quick Start

;; Add to deps.edn
{:deps {buran/buran {:mvn/version "0.1.4"}}}

;; Or to project.clj
[buran "0.1.4"]

;; In your namespace
(ns your.app
  (:require [buran.core :as buran]))

;; Parse a feed
(def data (buran/consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure"))

;; Generate a feed
(buran/produce {:info {:feed-type "atom_1.0" :title "My Feed"}
                :entries [{:title "Hello" :description {:value "World"}}]})

Usage

Regardless of the feed format and whether you are consuming or producing a feed, Buran uses the same data structure. The API is concise: consume, consume-http, and produce, plus helpers for manipulating feeds, including combine-feeds, filter-entries, sort-entries-by, and shrink. The basic workflow is to pass that data structure from one API function to the next. See the documentation for the various options and details.

Examples

Consume a feed from String

(def feed "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
           <feed xmlns=\"http://www.w3.org/2005/Atom\">
             <title>Feed title</title>
             <subtitle />
             <entry>
               <title>Entry title</title>
               <author>
                 <name />
               </author>
               <summary>entry description</summary>
             </entry>
           </feed>
           ")
(shrink (consume feed))
=>
{:info    {:feed-type "atom_1.0", 
           :title     "Feed title"},
 :entries [{:title       "Entry title", 
            :description {:value "entry description"}}]}

Produce a feed

(def feed {:info {:feed-type "atom_1.0"
                  :title     "Feed title"}
           :entries [{:title       "Entry title"
                      :description {:value "entry description"}}]})
(produce feed)
=>
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <feed xmlns=\"http://www.w3.org/2005/Atom\">\r
   <title>Feed title</title>\r
   <subtitle />\r
   <entry>\r
     <title>Entry title</title>\r
     <author>\r
       <name />\r
     </author>\r
     <summary>entry description</summary>\r
   </entry>\r
 </feed>
 "

Consume a feed over http

(consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure")
=>
{:info {...},
 :entries [...],
 :foreign-markup [...]}

Shrink a feed (remove nils, empty collections, empty maps, etc.)

(shrink (consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure"))
=>
{:info {:description "most recent 30 from stackoverflow.com",
        :feed-type "atom_1.0",
        :published-date #inst"2018-08-20T08:03:33.000-00:00",
        :title "Active questions tagged clojure - Stack Overflow",
        :link "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
        :uri "https://stackoverflow.com/feeds/tag?tagnames=clojure",
        :links [{:href "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
                 :type "text/html",
                 :rel "alternate",
                 :length 0}, ...]},
 :entries [{:description {:type "html", :value "<p>..."},
            :updated-date #inst"2018-08-20T06:16:12.000-00:00",
            :foreign-markup [...],
            :published-date #inst"2018-08-20T05:54:39.000-00:00",
            :title "Clojure evaluate lazy sequence",
            :author "Constantine",
            :categories [{:name "clojure", :taxonomy-uri "https://stackoverflow.com/tags"}, ...],
            :link "https://stackoverflow.com/questions/51924808/clojure-evaluate-lazy-sequence",
            :uri "https://stackoverflow.com/q/51924808",
            :authors [{:name "Constantine", :uri "https://stackoverflow.com/users/4201205"}],
            :links [{:href "https://stackoverflow.com/questions/51924808/clojure-evaluate-lazy-sequence",
                     :rel "alternate",
                     :length 0}]}, ...],
 :foreign-markup [...]}

Supported Formats

Format	Parse	Generate	Notes
Atom 1.0	✅	✅	Full support
Atom 0.3	✅	✅	Legacy
RSS 2.0	✅	✅	Most common
RSS 1.0	✅	✅	RDF-based
RSS 0.9x	✅	✅	Various variants
RSS 0.9	✅	✅	Original

Basic API Reference

`consume`

Parse a feed from string, file, reader, or other sources.

;; Shortcut
(consume "<?xml version=\"1.0\"?><feed>...</feed>")

;; With options
(consume {:from             (java.io.File. (System/getProperty "user.home") "feed.xml")
                                        ; String, File, Reader, W3C DOM document, JDOM document, W3C SAX InputSource
                                        ; (java.io.File does not expand "~", so the home directory is resolved explicitly)
          :validate         false       ; Indicates if the input should be validated
          :locale           (Locale/US) ; java.util.Locale
          :xml-healer-on    true        ; Healing trims leading chars from the stream (empty spaces and comments) until the XML prolog.
                                        ; Healing resolves HTML entities (from literal to code number) in the reader.
                                        ; The healing is done only with the File and Reader.
          :allow-doctypes   false       ; Activate only when the feeds you process are fully trusted
          :throw-exception  false       ; false - return a map with the exception; true - throw the exception
         })

Option	Type	Default	Description
`:from`	String, File, Reader, InputStream, W3C DOM, JDOM, SAX InputSource	required	Source to parse
`:validate`	boolean	`false`	Validate XML against DTD/schema
`:locale`	`java.util.Locale`	`(Locale/US)`	Locale for parsing
`:xml-healer-on`	boolean	`true`	Trim whitespace/comments before XML prolog; resolve HTML entities
`:allow-doctypes`	boolean	`false`	Allow DOCTYPE declarations (⚠️ security risk - only for trusted sources)
`:throw-exception`	boolean	`false`	If `false`, return error map; if `true`, throw exception

`consume-http`

Fetch and parse a feed over HTTP.

;; Shortcut
(consume-http "https://example.com/feed.xml")

;; With options
(consume-http {:from             "https://stackoverflow.com/feeds/tag?tagnames=clojure"
                                                      ; <http url string>, URL, File, InputStream
               :headers          {"X-Header" "Value"} ; Request's HTTP headers map
               :lenient          true                 ; Indicates if the charset encoding detection should be relaxed
               :default-encoding "US-ASCII"           ; Supports: UTF-8, UTF-16, UTF-16BE, UTF-16LE, CP1047, US-ASCII
               ;; ... plus all options accepted by a (consume) call
              })

Option	Type	Default	Description
`:from`	String URL, `java.net.URL`, File, InputStream	required	URL or source to fetch
`:headers`	map	`{}`	HTTP headers (e.g., `{"User-Agent" "MyApp"}`)
`:lenient`	boolean	`true`	Relaxed charset encoding detection
`:default-encoding`	String	`"US-ASCII"`	Fallback encoding: `UTF-8`, `UTF-16`, `UTF-16BE`, `UTF-16LE`, `CP1047`, `US-ASCII`
`:content-type`	String	`nil`	Override Content-Type header (used with InputStream)

Beware! Fetching with consume-http from an HTTP URL string or a URL is rudimentary and works only for the simplest cases. For instance, it does not follow HTTP 302 redirects. Consider using a separate library such as clj-http or http-kit to fetch the feed.
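
One way to work around the redirect limitation is to fetch the XML yourself and pass the resulting string to consume. The sketch below is an assumption-laden illustration rather than a recommended setup: instead of clj-http or http-kit it uses the JDK's built-in `java.net.http.HttpClient` (JDK 11 or newer) to stay dependency-free, and `build-request` and `fetch-feed-xml` are hypothetical helper names.

```clojure
(import '(java.net URI)
        '(java.net.http HttpClient HttpClient$Redirect
                        HttpRequest HttpResponse$BodyHandlers))

(defn build-request
  "Builds a GET request for url, adding each entry of the headers map."
  ^HttpRequest [url headers]
  (let [builder (HttpRequest/newBuilder (URI/create url))]
    (doseq [[k v] headers]
      (.header builder k v))
    (.build (.GET builder))))

(defn fetch-feed-xml
  "Fetches url and returns the response body as a String.
  Redirects are followed, except from HTTPS to HTTP."
  [url headers]
  (let [client   (-> (HttpClient/newBuilder)
                     (.followRedirects HttpClient$Redirect/NORMAL)
                     (.build))
        response (.send client
                        (build-request url headers)
                        (HttpResponse$BodyHandlers/ofString))]
    (.body response)))

;; Usage (performs a network call, so it is left commented out):
;; (consume (fetch-feed-xml "https://example.com/feed.xml" {"User-Agent" "MyApp"}))
```

This sketch does not check the HTTP status code or handle timeouts, so a production fetcher would likely need both.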

`produce`

Generate an RSS/Atom feed as a string, a file, or a DOM document.

(produce {:feed            {:info {:feed-type "atom_1.0" ; Supports: atom_1.0, atom_0.3, rss_2.0, 
                                                         ; rss_1.0, rss_0.94, rss_0.93, rss_0.92, 
                                                         ; rss_0.91U (Userland), rss_0.91N (Netscape), 
                                                         ; rss_0.9
                                   :title "Feed title"}
                            :entries [{:title       "Entry 1 title"
                                       :description {:value "entry description"}}]
                            :foreign-markup nil}

          :to              :string ; <file path string>, :string, :w3cdom, :jdom, File, Writer
          :pretty-print    true    ; Pretty-print XML output
          :throw-exception false   ; false - return map with an exception, throw an exception otherwise
         })

Option	Type	Default	Description
`:feed`	map	`nil` (uses argument as feed)	Feed data structure to generate
`:to`	`:string`, `:w3cdom`, `:jdom`, String (file path), File, Writer	`:string`	Output destination
`:pretty-print`	boolean	`true`	Pretty-print XML output
`:throw-exception`	boolean	`false`	If `false`, return error map; if `true`, throw exception

`shrink`

Remove nil values and empty collections from feed data.

(shrink feed)
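
To make the idea concrete, here is a rough sketch of this kind of pruning in plain Clojure. It is not Buran's implementation of shrink, and `prune` is a hypothetical name. It drops nil values and collections that are empty after their own contents have been pruned, and it leaves empty strings alone, which may differ from what shrink does.

```clojure
(defn- blank-value?
  "True for nil and for empty collections (strings are not collections)."
  [v]
  (or (nil? v)
      (and (coll? v) (empty? v))))

(defn prune
  "Recursively removes nil values and empty collections from nested
  maps and sequences. Illustrative only; not Buran's shrink."
  [x]
  (cond
    (map? x)        (into {}
                          (for [[k v] x
                                :let  [pruned (prune v)]
                                :when (not (blank-value? pruned))]
                            [k pruned]))
    (sequential? x) (into []
                          (comp (map prune) (remove blank-value?))
                          x)
    :else           x))

(prune {:info           {:title "Feed title" :subtitle nil :links []}
        :entries        [{:title "Entry title" :author {:name nil}}]
        :foreign-markup nil})
```

Pruning children before testing for emptiness is what lets a map such as `{:name nil}` disappear entirely instead of surviving as `{}`.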

License

Distributed under the MIT License
