Title: | Package Directives and Collaboration Networks in CRAN |
---|---|
Description: | Core visualizations and summaries for the CRAN package database. The package provides comprehensive methods for cleaning up and organizing the information in the CRAN package database, for building package directives networks (depends, imports, suggests, enhances, linking to) and collaboration networks, producing package dependence trees, and for computing useful summaries and producing interactive visualizations from the resulting networks and summaries. The resulting networks can be coerced to 'igraph' <https://CRAN.R-project.org/package=igraph> objects for further analyses and modelling. |
Authors: | Ioannis Kosmidis [aut, cre] |
Maintainer: | Ioannis Kosmidis <[email protected]> |
License: | GPL-3 |
Version: | 0.6.0 |
Built: | 2024-10-31 16:36:08 UTC |
Source: | https://github.com/ikosmidis/cranly |
Coerce a cranly_network to an igraph::graph object
## S3 method for class 'cranly_network'
as.igraph(x, reverse = FALSE, ...)
x | a cranly_network object. |
reverse | logical. Should the direction of the edges be reversed? See details. Default is FALSE. |
... | currently not used. |
The convention for a cranly_network object with perspective = "package" is that the direction of an edge is from the package that is imported by, suggested by, enhances or is a dependency of another package, to the latter package. reverse reverses that direction to correctly compute relevant network summaries (see summary.cranly_network). reverse is only relevant when attr(x, "perspective") is "package" and is ignored when attr(x, "perspective") is "author", in which case the resulting igraph::graph object represents an undirected network of authors.
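The edge convention and the effect of reversing it can be illustrated on a toy edge list in base R. This is only a sketch: the data frame `edges`, the package names in it and the helper `reverse_edges()` are hypothetical and are not part of cranly, whose networks carry more information per edge.

```r
## Toy edge list following the convention described above: an edge goes
## from the package that is imported (or depended on) to the package that
## declares the directive. Package names are made up.
edges <- data.frame(
  from = c("pkgA", "pkgB"),
  to   = c("pkgB", "pkgC"),
  type = c("imports", "depends"),
  stringsAsFactors = FALSE
)

## Hypothetical helper (not a cranly function): swap the endpoints of
## every edge, which is the effect described for reverse = TRUE.
reverse_edges <- function(edges) {
  reversed <- edges
  reversed$from <- edges$to
  reversed$to <- edges$from
  reversed
}

reversed <- reverse_edges(edges)
```

In the reversed list each edge starts at the package that declares the directive, so the same edges are read in the opposite direction.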
## Package directives network
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)
igraph::as.igraph(package_network)
## Author collaboration network
author_network <- build_network(cran_db, perspective = "author")
igraph::as.igraph(author_network)
build_dependence_tree method for an object
build_dependence_tree(x, ...)
x | an object to use for building a dependence tree |
... | other arguments to be passed to the method |
build_network.cranly_network compute_dependence_tree
Construct a cranly_dependence_tree object
## S3 method for class 'cranly_network'
build_dependence_tree(
  x,
  package = Inf,
  base = FALSE,
  recommended = TRUE,
  global = TRUE,
  ...
)
x | a cranly_network object. |
package | a vector of character strings with the package names to be matched. Default is Inf. |
base | logical. Should we include base packages in the subset? Default is FALSE. |
recommended | logical. Should we include recommended packages in the subset? Default is TRUE. |
global | logical. If |
... | currently not used. |
compute_dependence_tree()
plot.cranly_dependence_tree()
summary.cranly_dependence_tree()
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)
dep_tree <- build_dependence_tree(package_network, package = "PlackettLuce")
plot(dep_tree)
Compute edges and nodes of package directives and collaboration networks
## S3 method for class 'cranly_db'
build_network(object, trace = FALSE, perspective = "package", ...)
object | a cranly_db object. |
trace | logical. Print progress information? Default is FALSE. |
perspective | character. Should a |
... | Other arguments passed in |
The convention for a cranly_network object with perspective = "package" is that the direction of an edge is from the package that is imported by, suggested by, enhances or is a dependency of another package, to the latter package. The author collaboration network is analyzed and visualized as undirected by all methods in cranly.
A list of 2 data.frame objects with the edges and nodes of the network.
clean_CRAN_db()
subset.cranly_network()
plot.cranly_network()
extractor-functions
cran_db <- clean_CRAN_db()
## Build package directives network
package_network <- build_network(object = cran_db, perspective = "package")
head(package_network$edges)
head(package_network$nodes)
attr(package_network, "timestamp")
class(package_network)
## Build author collaboration network
author_network <- build_network(object = cran_db, perspective = "author")
head(author_network$edges)
head(author_network$nodes)
attr(author_network, "timestamp")
class(author_network)
Clean and organize package and author names in the output of tools::CRAN_package_db()
clean_CRAN_db(
  packages_db,
  clean_directives = clean_up_directives,
  clean_author = clean_up_author,
  clean_maintainer = standardize_whitespace
)
packages_db | a |
clean_directives | a function that transforms the contents of the various directives in the package descriptions to vectors of package names. Default is clean_up_directives. |
clean_author | a function that transforms the contents of |
clean_maintainer | a function that transforms the contents of |
clean_CRAN_db() uses clean_up_directives() and clean_up_author() to clean up the author names and the package names in the various directives (like Imports, Depends, Suggests, Enhances, LinkingTo) in the data.frame that results from tools::CRAN_package_db(), and returns an organized data.frame of class cranly_db that can be used for further analysis.
The function tries hard to identify and eliminate mistakes in the Author field of the description file, and to extract a clean list of only author names. The relevant operations are coded in the clean_up_author() function. Specifically, some references to copyright holders had to go because they were contaminating the list of authors (most are not necessary anyway, but that is a different story...). The current version of clean_up_author() is far from best practice in using regex, but it currently does a fair job in cleaning up messy Author fields. It will be improved in future versions.
Custom clean-up functions can also be supplied via the clean_directives and clean_author arguments.
A data.frame with the same variables as packages_db (but with lower case names), that also inherits from cranly_db, and has a timestamp attribute.
## Download today's CRAN package database
cran_db <- tools::CRAN_package_db()
## Before clean up
cran_db[cran_db$Package == "weights", "Author"]
## After clean up
package_db <- clean_CRAN_db(cran_db)
package_db[package_db$package == "weights", "author"]
Clean up author names
clean_up_author(variable)
variable | a character string. |
A list of one vector of character strings.
clean_up_author(paste("The R Core team, Brian & with some assistance from Achim, Hadley;",
                      "Kurt\n Portugal; Ireland; Italy; Greece; Spain"))
Clean up package directives
clean_up_directives(variable)
variable | a character string. |
A list of one vector of character strings.
clean_up_directives("R (234)\n stats (>0.01), base\n graphics")
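The kind of transformation a directive clean-up function performs can be sketched in base R. The helper `parse_directive()` below is hypothetical and is not the code behind clean_up_directives(): it assumes entries are separated by commas, returns a plain character vector rather than a list of one vector, and does not try to handle entries separated only by a line break.

```r
## Hypothetical helper (not cranly's clean_up_directives()): turn the raw
## contents of a directive field into a vector of package names.
parse_directive <- function(variable) {
  ## Drop version requirements such as "(>= 3.4.0)"
  no_versions <- gsub("\\([^)]*\\)", "", variable)
  ## Split on commas and trim surrounding whitespace (including newlines)
  parts <- trimws(strsplit(no_versions, ",", fixed = TRUE)[[1]])
  ## Discard empty entries, e.g. those produced by a trailing comma
  parts[nzchar(parts)]
}

pkgs <- parse_directive("R (>= 3.4.0), stats,\n  utils (>= 3.0), graphics")
```

A function of this shape, wrapped so that it returns a list, is the sort of thing that could be supplied through the clean_directives argument of clean_CRAN_db().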
Computes the dependence tree of a package
compute_dependence_tree(x, package = NULL, generation = 0)
x | a |
package | a vector of character strings with the package names to be matched. If |
generation | integer. The original generation for the package. |
Implements a recursion that computes the full dependence tree of a package from x. Specifically, the packages that are requirements for package (Depends, Imports or LinkingTo) are found, then the requirements for those packages are found, and so on.
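The recursion described above can be sketched on a toy set of requirements. Everything in this block is hypothetical: the list `requirements`, the package names and the helper `toy_dependence_tree()` are illustrations and not cranly's implementation. The sketch assumes there are no circular requirements and does not remove packages that are reached more than once.

```r
## Toy requirements: each element lists the packages that the named
## package depends on, imports or links to (made-up names).
requirements <- list(
  pkgA = c("pkgB", "pkgC"),
  pkgB = "pkgD",
  pkgC = character(0),
  pkgD = character(0)
)

## Hypothetical recursive helper: returns a data frame with every package
## reached and the generation at which it was found (0, -1, -2, ...).
toy_dependence_tree <- function(requirements, package, generation = 0) {
  current <- data.frame(package = package, generation = generation,
                        stringsAsFactors = FALSE)
  ## Requirements of all packages in the current generation
  parents <- unlist(requirements[package], use.names = FALSE)
  if (length(parents) == 0) {
    return(current)
  }
  ## Recurse one generation further back
  rbind(current,
        toy_dependence_tree(requirements, unique(parents), generation - 1))
}

tree <- toy_dependence_tree(requirements, "pkgA")
```

Here pkgA sits at generation 0, its direct requirements at generation -1, and their requirements at generation -2.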
build_dependence_tree.cranly_network()
Compute term frequencies from a vector of text
compute_term_frequency(
  txt,
  ignore_words = c("www.jstor.org", "www.arxiv.org", "arxiv.org", "provides", "https"),
  stem = FALSE,
  remove_punctuation = TRUE,
  remove_stopwords = TRUE,
  remove_numbers = TRUE,
  to_lower = TRUE,
  frequency = "term"
)
txt | a vector of character strings. |
ignore_words | a vector of words to be ignored when forming the corpus. |
stem | should words be stemmed using Porter's stemming algorithm? Default is FALSE. |
remove_punctuation | should punctuation be removed when forming the corpus? Default is TRUE. |
remove_stopwords | should English stopwords be removed when forming the corpus? Default is TRUE. |
remove_numbers | should numbers be removed when forming the corpus? Default is TRUE. |
to_lower | should all terms be coerced to lower-case when forming the corpus? Default is TRUE. |
frequency | the type of term frequencies to return. Options are "term" (default), "document-term" and "term-document". |
The operations take place as follows: remove special characters, convert to lower-case (depending on the values of |
If txt is a named vector then the names are used as document ids when forming the corpus.
Either a named numeric vector (frequency = "term"), or an object of class tm::DocumentTermMatrix (frequency = "document-term"), or an object of class tm::TermDocumentMatrix (frequency = "term-document").
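What a named numeric vector of term frequencies looks like can be sketched in base R. The helper `toy_term_frequency()` is hypothetical and much simpler than compute_term_frequency(): it does not build a corpus, does not stem, and does not remove stopwords; it only lower-cases, strips punctuation and digits, drops ignored words and counts what is left.

```r
## Hypothetical helper (not cranly's compute_term_frequency()): count the
## terms in a character vector using base R only.
toy_term_frequency <- function(txt, ignore_words = character(0)) {
  ## Lower-case, then replace punctuation and digits with spaces
  cleaned <- gsub("[[:punct:][:digit:]]+", " ", tolower(txt))
  ## Split on whitespace and pool the terms of all documents
  terms <- unlist(strsplit(cleaned, "[[:space:]]+"))
  terms <- terms[nzchar(terms) & !(terms %in% ignore_words)]
  ## Named numeric vector of counts, most frequent terms first
  counts <- table(terms)
  freq <- as.numeric(counts)
  names(freq) <- names(counts)
  sort(freq, decreasing = TRUE)
}

freq <- toy_term_frequency(
  c(doc1 = "Fits generalized linear models.",
    doc2 = "Provides linear models and linear mixed models."),
  ignore_words = c("provides", "and")
)
```

The names of `freq` are the terms and the values are their counts over both toy documents.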
cranly provides core visualizations and summaries for the CRAN package database. The package provides comprehensive methods for cleaning up and organizing the information in the CRAN package database, for building package directives networks (depends, imports, suggests, enhances, linking to) and collaboration networks, and for computing summaries and producing interactive visualizations from the resulting networks. Network visualization is through the visNetwork (https://CRAN.R-project.org/package=visNetwork) package. The package also provides functions to coerce the networks to igraph (https://CRAN.R-project.org/package=igraph) objects for further analyses and modelling.
Acknowledgements:
David Selby (https://selbydavid.com) experimented with and provided helpful comments and feedback on a pre-release version of cranly. His help is gratefully acknowledged.
This work has been partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1 (Turing award number TU/B/000082).
Find packages, authors, maintainers, license, versions etc by authors, packages or names matching a specific string
## S3 method for class 'cranly_network'
package_by(x, author = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
package_with(x, name = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
author_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
author_with(x, name = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
suggested_by(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
suggesting(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
imported_by(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
importing(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
dependency_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
depending_on(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
linked_by(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
linking_to(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
enhanced_by(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
enhancing(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
maintainer_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
maintained_by(x, author = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
email_of(x, author = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
email_with(x, name = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
description_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
title_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
license_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
version_of(x, package = NULL, exact = FALSE, flat = TRUE)
## S3 method for class 'cranly_network'
release_date_of(x, package = NULL, exact = FALSE, flat = TRUE)
packages_by(x, author, exact, flat)
packages_with(x, name = NULL, exact = FALSE, flat = TRUE)
authors_with(x, name = NULL, exact = FALSE, flat = TRUE)
authors_of(x, package = NULL, exact = FALSE, flat = TRUE)
emails_of(x, author = NULL, exact = FALSE, flat = TRUE)
emails_with(x, name = NULL, exact = FALSE, flat = TRUE)
descriptions_of(x, package = NULL, exact = FALSE, flat = TRUE)
titles_of(x, package = NULL, exact = FALSE, flat = TRUE)
licenses_of(x, package = NULL, exact = FALSE, flat = TRUE)
release_dates_of(x, package = NULL, exact = FALSE, flat = TRUE)
versions_of(x, package = NULL, exact = FALSE, flat = TRUE)
x | a cranly_network object. |
author | a vector of character strings with the author names to be matched. If |
exact | logical. Should we use exact matching? Default is FALSE. |
flat | if |
name | a vector of character strings with the names to be matched. If |
package | a vector of character strings with the package names to be matched. If |
The extractor functions all try to figure out what y is in the statement
y is (the) extractor-function a package/author.
For example, for
"y is the package by "Kurt Hornik"" we do package_by(x, "Kurt Hornik")
"y is the author of a package with a name matching "MASS"" we do author_of(x, "MASS")
"y is the package enhanced by the "prediction" package" we do enhanced_by(x, "prediction", exact = TRUE)
"y is the package linking to "Rcpp"" we do linking_to(x, "Rcpp", exact = TRUE)
If flat = FALSE then the result of the extractor function is a data.frame, which is the subset of x$nodes matching author, name or package (according to the value of exact). If flat = TRUE then the result is a vector, and any NAs are removed before the result is returned.
build_network.cranly_db()
subset.cranly_network()
plot.cranly_network()
## Using a package directives network
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)
## Find all packages containing glm in their name
package_with(package_network, name = "glm")
## Find all authors of packages containing rglm in their name
author_of(package_network, package = "rglm", exact = FALSE)
## Find all packages with rglm in their name
package_with(package_network, name = "rglm", exact = FALSE)
## Find all authors of the package brglm2
author_of(package_network, package = "brglm2", exact = TRUE)
## Find all authors with Ioannis in their name
author_with(package_network, name = "Ioannis", exact = FALSE)
## Find all packages suggested by Rcpp
suggested_by(package_network, package = "Rcpp", exact = TRUE)
## Find all packages imported by Rcpp
imported_by(package_network, package = "Rcpp", exact = TRUE)
## Find all packages enhancing brglm
enhancing(package_network, package = "brglm", exact = TRUE)
## Find all packages linking to RcppArmadillo
linking_to(package_network, package = "RcppArmadillo", exact = TRUE)
## Find the release date of RcppArmadillo
release_date_of(package_network, package = "RcppArmadillo", exact = TRUE)
## Find the release date of all packages with "brglm" in their name
release_date_of(package_network, package = "brglm", exact = FALSE)
## More information about packages with "brglm" in their name
release_date_of(package_network, package = "brglm", exact = FALSE,
                flat = FALSE)[c("package", "version")]
## Using an author collaboration network
author_network <- build_network(cran_db, perspective = "author")
## Find all packages containing glm in their name
package_with(author_network, name = "glm")
## Find all authors of packages containing rglm in their name
author_of(author_network, package = "rglm", exact = FALSE)
## Find all packages with rglm in their name
package_with(author_network, name = "rglm", exact = FALSE)
## Find all authors of the package brglm2
author_of(author_network, package = "brglm2", exact = TRUE)
## Find all authors with Ioannis in their name
author_with(author_network, name = "Ioannis", exact = FALSE)
Interactive visualization of package(s) dependence tree from a cranly_network
## S3 method for class 'cranly_dependence_tree'
plot(
  x,
  physics_threshold = 200,
  height = NULL,
  width = NULL,
  dragNodes = TRUE,
  dragView = TRUE,
  zoomView = TRUE,
  legend = TRUE,
  title = TRUE,
  plot = TRUE,
  ...
)
x | a cranly_dependence_tree object. |
physics_threshold | integer. How many nodes before switching off physics simulations for edges? Default is 200. |
height | Height (optional, defaults to automatic sizing) |
width | Width (optional, defaults to automatic sizing) |
dragNodes | logical. Should the user be able to drag the nodes that are not fixed? Default is TRUE. |
dragView | logical. Should the user be able to drag the view around? Default is TRUE. |
zoomView | logical. Should the user be able to zoom in? Default is TRUE. |
legend | logical. Should a legend be added on the resulting visualization? Default is TRUE. |
title | logical. Should a title be added on the resulting visualization? Default is TRUE. |
plot | logical. Should the visualization be returned? Default is TRUE. |
... | currently not used. |
compute_dependence_tree()
build_dependence_tree.cranly_network()
Interactive visualization of a package or author cranly_network
## S3 method for class 'cranly_network'
plot(
  x,
  package = Inf,
  author = Inf,
  directive = c("imports", "suggests", "enhances", "depends", "linking_to"),
  base = TRUE,
  recommended = TRUE,
  exact = TRUE,
  global = TRUE,
  physics_threshold = 200,
  height = NULL,
  width = NULL,
  dragNodes = TRUE,
  dragView = TRUE,
  zoomView = TRUE,
  legend = TRUE,
  title = TRUE,
  plot = TRUE,
  ...
)
x | a cranly_network object. |
package | a vector of character strings with the package names to be matched. Default is Inf. |
author | a vector of character strings with the author names to be matched. Default is Inf. |
directive | a vector of at least one of "imports", "suggests", "enhances", "depends", "linking_to". |
base | logical. Should we include base packages in the subset? Default is TRUE. |
recommended | logical. Should we include recommended packages in the subset? Default is TRUE. |
exact | logical. Should we use exact matching? Default is TRUE. |
global | logical. If |
physics_threshold | integer. How many nodes before switching off physics simulations for edges? Default is 200. |
height | Height (optional, defaults to automatic sizing) |
width | Width (optional, defaults to automatic sizing) |
dragNodes | logical. Should the user be able to drag the nodes that are not fixed? Default is TRUE. |
dragView | logical. Should the user be able to drag the view around? Default is TRUE. |
zoomView | logical. Should the user be able to zoom in? Default is TRUE. |
legend | logical. Should a legend be added on the resulting visualization? Default is TRUE. |
title | logical. Should a title be added on the resulting visualization? Default is TRUE. |
plot | logical. Should the visualization be returned? Default is TRUE. |
... | currently not used. |
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)
## The package directives network of all users with Ioannis in
## their name from the CRAN database subset crandb
plot(package_network, author = "Ioannis", exact = FALSE)
## The package directives network of "Achim Zeileis"
plot(package_network, author = "Achim Zeileis")
author_network <- build_network(cran_db, perspective = "author")
plot(author_network, author = "Ioannis", exact = FALSE, title = TRUE)
Top-n package or author barplots according to a range of network statistics
## S3 method for class 'summary_cranly_network'
plot(x, top = 20, according_to = NULL, scale = FALSE, ...)
x | a summary_cranly_network object. |
top | integer. How many top packages or authors should be plotted? Default is 20. |
according_to | the statistic according to which the top- |
scale | logical. Should the statistics be scaled to lie between |
... | currently not used. |
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)
package_summaries <- summary(package_network)
plot(package_summaries, according_to = "n_imported_by", top = 30)
plot(package_summaries, according_to = "n_depended_by", top = 30)
plot(package_summaries, according_to = "page_rank", top = 30)
## author network
author_network <- build_network(cran_db, perspective = "author")
author_summaries <- summary(author_network)
plot(author_summaries, according_to = "n_collaborators", top = 30)
plot(author_summaries, according_to = "n_packages", top = 30)
plot(author_summaries, according_to = "page_rank", top = 30)
Standardize whitespace in strings
standardize_whitespace(variable)
variable | a character string. |
A list of one vector of character strings.
standardize_whitespace(" My spacebar key is broken. ")
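What "standardizing" whitespace involves is not spelled out above. One plausible reading, sketched here in base R, is to collapse every run of whitespace into a single space and trim both ends. The helper `squash_whitespace()` is hypothetical, is an assumption about the behaviour rather than cranly's code, and returns a character vector rather than a list.

```r
## Hypothetical base-R sketch (an assumption, not cranly's implementation):
## collapse runs of whitespace into one space and trim both ends.
squash_whitespace <- function(variable) {
  trimws(gsub("[[:space:]]+", " ", variable))
}

cleaned <- squash_whitespace("  My   spacebar key is\n broken.  ")
```

Because the function is vectorized over its input, it can be applied directly to a whole column of maintainer strings.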
Subset a cranly_network according to author, package and/or directive
## S3 method for class 'cranly_network'
subset(
  x,
  package = Inf,
  author = Inf,
  maintainer = Inf,
  directive = c("imports", "suggests", "enhances", "depends", "linking_to"),
  base = TRUE,
  recommended = TRUE,
  exact = TRUE,
  only = FALSE,
  ...
)
x | a cranly_network object. |
package | a vector of character strings with the package names to be matched. Default is Inf. |
author | a vector of character strings with the author names to be matched. Default is Inf. |
maintainer | a vector of character strings with the maintainer names to be matched. Default is Inf. |
directive | a vector of at least one of "imports", "suggests", "enhances", "depends", "linking_to". |
base | logical. Should we include base packages in the subset? Default is TRUE. |
recommended | logical. Should we include recommended packages in the subset? Default is TRUE. |
exact | logical. Should we use exact matching? Default is TRUE. |
only | logical. If |
... | currently not used. |
A cranly_network object that is the subset of x.
Hard dependence summaries for R packages from a cranly_dependence_tree object
## S3 method for class 'cranly_dependence_tree'
summary(object, ...)
object | a cranly_dependence_tree object. |
... | currently not used. |
The summary method for a cranly_dependence_tree object returns the number of generations the R package(s) in the object inherit from (n_generations), the immediate parents of the R package(s) (parents), and a dependence index dependence_index defined as
$$-\frac{\sum_{i \in C_p;\, i \ne p} \frac{1}{N_i} g_i}{\sum_{i \in C_p;\, i \ne p} \frac{1}{N_i}}$$
where $C_p$ is the dependence tree for the package(s) $p$, $N_i$ is the total number of packages that depend, link or import package $i$, and $g_i$ is the generation that package $i$ appears in the dependence tree of package(s) $p$. The generation takes values on the non-positive integers, with the package(s) $p$ being placed at generation 0, the packages that $p$ links to, depends or imports at generation -1 and so on.
A dependence index of zero means that the package only has
immediate parents. The dependence index weights the dependencies
by how popular they are, in the sense that the index is
not penalized if the package depends on popular packages. The
greater the dependence index, the more baggage the package
carries, and the maintainers may want to remove any dependencies
that are not necessary.
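As a rough illustration of the weighting described above, the following base R sketch computes the three summaries for a made-up dependence tree. It assumes the index is the negative of a weighted mean of the generations, with weights inversely proportional to the number of packages that use each dependency. The data frame, its column names (including the hypothetical n_dependents) and all counts are invented for illustration and are not the internal representation used by cranly.

```r
# Hypothetical dependence tree for a package "p" (not real CRAN data).
# n_dependents plays the role of the number of packages that depend on,
# link to or import each package; it is not needed for "p" itself.
tree <- data.frame(
  package = c("p", "a", "b", "c"),
  generation = c(0, -1, -1, -2),
  n_dependents = c(NA, 1000, 10, 50)
)

# Keep every package in the tree except "p" itself
ancestors <- tree[tree$generation < 0, ]

# Popular dependencies get small weights, so they move the index less
weights <- 1 / ancestors$n_dependents

# Negative weighted mean of the (non-positive) generations
dependence_index <- -weighted.mean(ancestors$generation, weights)

# Number of generations and immediate parents
n_generations <- -min(tree$generation)
parents <- tree$package[tree$generation == -1]

list(
  n_generations = n_generations,
  parents = parents,
  dependence_index = dependence_index
)
```

In this toy tree the rarely used package b carries most of the weight, while the very popular package a contributes almost nothing to the index.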
A list with components n_generations, parents, and dependence_index.
build_dependence_tree.cranly_network()
compute_dependence_tree()
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)

## Two light packages
dep_tree <- build_dependence_tree(package_network, package = "brglm")
summary(dep_tree)
dep_tree <- build_dependence_tree(package_network, package = "gnm")
summary(dep_tree)

## A somewhat heavier package (sorry)...
dep_tree <- build_dependence_tree(package_network, package = "cranly")
summary(dep_tree)
Compute a range of package directives and collaboration network statistics
## S3 method for class 'cranly_network'
summary(object, advanced = TRUE, ...)
object |
a |
advanced |
logical. If |
... |
currently not used. |
If attr(object, "perspective") is "package" then the resulting data.frame will have the following variables:
package. package name
n_authors (basic). number of authors for the package
n_imports (basic). number of packages the package imports
n_imported_by (basic). number of times the package is imported by other packages
n_suggests (basic). number of packages the package suggests
n_suggested_by (basic). number of times the package is suggested by other packages
n_depends (basic). number of packages the package depends on
n_depended_by (basic). number of packages that have the package as a dependency
n_enhances (basic). number of packages the package enhances
n_enhanced_by (basic). number of packages the package is enhanced by
n_linking_to (basic). number of packages the package links to
n_linked_by (basic). number of packages the package is linked by
betweenness (advanced). the package betweenness in the package network; as computed by igraph::betweenness()
closeness (advanced). the closeness centrality of the package in the package network; as computed by igraph::closeness()
page_rank (advanced). the Google PageRank of the package in the package network; as computed by igraph::page_rank()
degree (advanced). the degree of the package in the package network; as computed by igraph::degree()
eigen_centrality (advanced). the eigenvector centrality score of the package in the package network; as computed by igraph::eigen_centrality()
If attr(object, "perspective") is "author" then the resulting data.frame will have the following variables:
author. author name
n_packages (basic). number of packages that list the author among their authors
n_collaborators (basic). total number of co-authors the author has in CRAN
betweenness (advanced). the author betweenness in the author network; as computed by igraph::betweenness()
closeness (advanced). the closeness centrality of the author in the author network; as computed by igraph::closeness()
page_rank (advanced). the Google PageRank of the author in the author network; as computed by igraph::page_rank()
degree (advanced). the degree of the author in the author network; as computed by igraph::degree(); same as n_collaborators
eigen_centrality (advanced). the eigenvector centrality score of the author in the author network; as computed by igraph::eigen_centrality()
A data.frame of various statistics for the author collaboration network or the package directives network, depending on whether attr(object, "perspective") is "author" or "package", respectively. See Details for the current list of statistics returned.
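The advanced statistics listed in Details are standard igraph centrality measures. The sketch below computes them directly with igraph on a small, made-up undirected co-authorship graph; the author labels and edges are hypothetical, and the code only illustrates the igraph calls named above rather than how summary() assembles its output.

```r
# Hypothetical co-authorship edges (undirected); not real CRAN data
edges <- data.frame(
  from = c("A", "A", "B", "C"),
  to   = c("B", "C", "C", "D")
)
g <- igraph::graph_from_data_frame(edges, directed = FALSE)

# One row per author, one column per advanced statistic
stats <- data.frame(
  author = igraph::V(g)$name,
  betweenness = igraph::betweenness(g),
  closeness = igraph::closeness(g),
  page_rank = igraph::page_rank(g)$vector,
  degree = igraph::degree(g),
  eigen_centrality = igraph::eigen_centrality(g)$vector,
  row.names = NULL
)
stats
```

In an undirected author graph like this one, the degree of a vertex is its number of distinct co-authors, which is why the degree column coincides with n_collaborators in the author perspective.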
Word cloud of author names, package descriptions, and package titles
## S3 method for class 'cranly_network'
word_cloud(
  x,
  package = Inf,
  author = Inf,
  maintainer = Inf,
  base = TRUE,
  recommended = TRUE,
  exact = TRUE,
  perspective = "description",
  random_order = FALSE,
  ignore_words = c("www.jstor.org", "www.arxiv.org", "arxiv.org", "provides", "https"),
  stem = FALSE,
  colors = rev(colorspace::heat_hcl(10)),
  ...
)

## S3 method for class 'numeric'
word_cloud(
  x,
  random_order = FALSE,
  colors = rev(colorspace::heat_hcl(10)),
  ...
)
x |
either a |
package |
a vector of character strings with the package names to be matched. Default is |
author |
a vector of character strings with the author names to be matched. Default is |
maintainer |
a vector of character strings with the maintainer names to be matched. Default is |
base |
logical. Should we include base packages in the subset? Default is |
recommended |
logical. Should we include recommended packages in the subset? Default is |
exact |
logical. Should we use exact matching? Default is |
perspective |
should the wordcloud be that of package descriptions ( |
random_order |
should words be plotted in random order? If |
ignore_words |
a vector of words to be ignored when forming the corpus. |
stem |
should words be stemmed using Porter's stemming algorithm? Default is |
colors |
color words from least to most frequent |
... |
other arguments to be passed to wordcloud::wordcloud (except |
When applied to cranly_network objects, word_cloud() subsets either according to author (using the intersection of the results of author_of() and author_with()) or according to package (using the intersection of the results of package_with() and package_by()).
For handling more complex queries, one can manually extract the term frequencies from a supplied vector of character strings (see compute_term_frequency()), and use word_cloud() on them. See the examples.
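To make the idea of term frequencies concrete, here is a minimal base R sketch that turns a few made-up description strings into a named numeric vector of word counts. It is a simplified stand-in and not the implementation of compute_term_frequency(), which may clean the text differently; the strings and the ignore list are hypothetical.

```r
# Hypothetical package descriptions; not taken from CRAN
descriptions <- c(
  "Provides tools for fitting generalized linear models",
  "Tools for visualizing network data",
  "Fitting models to network data"
)
ignore_words <- c("provides", "for", "to")

# Lower-case, split on anything that is not a letter, drop empty tokens
# and ignored words
tokens <- unlist(strsplit(tolower(descriptions), "[^a-z]+"))
tokens <- tokens[nzchar(tokens) & !(tokens %in% ignore_words)]

# Tabulate and convert to a named numeric vector, most frequent first
counts <- table(tokens)
term_freq <- sort(
  setNames(as.numeric(counts), names(counts)),
  decreasing = TRUE
)
term_freq
```

A named numeric vector of this shape is presumably what the numeric method of word_cloud() is meant to receive, as with the output of compute_term_frequency() in the examples.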
A word cloud.
## Package directives network
cran_db <- clean_CRAN_db()
package_network <- build_network(cran_db)

## Descriptions of all packages in tidyverse
tidyverse <- imported_by(package_network, "tidyverse", exact = TRUE)
set.seed(123)
word_cloud(package_network, package = tidyverse, exact = TRUE, min.freq = 2)

## or by manually creating the term frequencies from descriptions
descriptions <- descriptions_of(package_network, tidyverse, exact = TRUE)
term_freq <- compute_term_frequency(descriptions)
set.seed(123)
word_cloud(term_freq, min.freq = 2)