tpNotes.jl

Purpose

This repo tracks my notes on things I learn in Julia.

Vocabulary

Standard idioms and their R equivalents.

Some R function names are implemented in src/r.jl

Vectors

Julia

julia> extrema([1,2,3])
(1, 3)

julia> first(collect(1:10))
1

julia> last(collect(1:10))
10

R

> range(c(1,2,3))
[1] 1 3
> head(1:10,1)
[1] 1
> tail(1:10,1)
[1] 10

Combinations

Julia

julia> using DataFrames
julia> rename(DataFrame(Base.Iterators.product([1,2], ["A","B","C"])), ["Num","Name"])
6×2 DataFrame
 Row │ Num    Name   
     │ Int64  String 
─────┼───────────────
   1 │     1  A
   2 │     2  A
   3 │     1  B
   4 │     2  B
   5 │     1  C
   6 │     2  C

julia> vec(string.(["x", "y"], [1 2 3])) ## note: col-vector, row-vector. Ref https://github.com/JuliaAcademy/DataFrames/blob/main/2.%20First%20steps%20with%20data%20frames.ipynb
6-element Array{String,1}:
 "x1"
 "y1"
 "x2"
 "y2"
 "x3"
 "y3"

R

setNames(expand.grid(c(1,2), c("A","B","C")), c("Num","Name"))
  Num Name
1   1    A
2   2    A
3   1    B
4   2    B
5   1    C
6   2    C

paste

The R function paste does two things: (1) it pastes the elements of two string vectors together element by element; (2) it collapses a string vector into a single string.

Combining

R:

paste(c("a", "b"), c(1,2))
[1] "a 1" "b 2"
paste(c("a", "b"), c(1,2), sep="")
[1] "a1" "b2"

Julia:

julia> string.(["a", "b"], " ", [1,2])
2-element Vector{String}:
 "a 1"
 "b 2"

julia> string.(["a", "b"], [1,2])
2-element Vector{String}:
 "a1"
 "b2"

Collapse

R

> paste(c("a", "b", "c"), collapse="")
[1] "abc"
> paste(c("a", "b", "c"), collapse=", ")
[1] "a, b, c"

Julia:

julia> join(["a", "b", "c"])
"abc"

julia> join(["a", "b", "c"], ", ")
"a, b, c"
julia> join(["a", "b", "c"], ", ", " and ")
"a, b and c"

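Both behaviours of paste can be combined in one small Julia helper. The following is a minimal sketch; the name `paste_sketch` and its keyword arguments are made up for illustration and are not necessarily how src/r.jl implements it:

```julia
# Hypothetical helper mimicking R's paste(); not taken from src/r.jl.
# Element-wise combination uses broadcasting, the optional collapse uses join.
function paste_sketch(a, b; sep=" ", collapse=nothing)
    combined = string.(a, sep, b)
    return collapse === nothing ? combined : join(combined, collapse)
end

spaced    = paste_sketch(["a", "b"], [1, 2])                         # ["a 1", "b 2"]
tight     = paste_sketch(["a", "b"], [1, 2]; sep="")                 # ["a1", "b2"]
collapsed = paste_sketch(["a", "b"], [1, 2]; sep="", collapse=", ")  # "a1, b2"
```
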
Loop over rows of data frame

Julia

using DataFrames, DataFramesMeta

df_in = rename(DataFrame(Base.Iterators.product([1,2], ["A","B","C"])), ["Num","Name"]);

@eachrow df_in begin 
  @newcol Res1::Vector{String}
  @newcol Res2::Vector{String}
  :Res1 = string(:Num) * :Name
  :Res2 = :Name * string(:Num)
end

6×4 DataFrame
 Row │ Num    Name    Res1    Res2   
     │ Int64  String  String  String 
─────┼───────────────────────────────
   1 │     1  A       1A      A1
   2 │     2  A       2A      A2
   3 │     1  B       1B      B1
   4 │     2  B       2B      B2
   5 │     1  C       1C      C1
   6 │     2  C       2C      C2

## Simpler for this case:
@transform(df_in, Res1 = string.(:Num) .* :Name, Res2 = :Name .* string.(:Num))

The last form is much simpler, but it only works if the function can be broadcast (AFAIK).
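The broadcasting requirement is often easy to meet, because a scalar function can usually be applied elementwise with the dot syntax (`f.(xs)`) or with `map`. A minimal sketch on plain vectors, where the `label` helper and the column values are made up for illustration:

```julia
# Hypothetical scalar helper: takes one number and one string at a time.
label(num::Integer, name::AbstractString) = string(num) * name

num_col  = [1, 2, 1, 2, 1, 2]
name_col = ["A", "A", "B", "B", "C", "C"]

res_broadcast = label.(num_col, name_col)    # dot syntax broadcasts the scalar function
res_map       = map(label, num_col, name_col) # map gives the same result here
```
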

Lists and dicts

Check if key is in dict

Julia:

julia> haskey(args,"fit_file")
true

R:

> "fit_file" %in% names(args)
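When a missing key should fall back to a default instead of being tested for explicitly, `get` covers both steps. A small sketch, assuming a hypothetical `args` dict with made-up contents:

```julia
# Hypothetical argument dict; the keys and file names are illustrative only.
args = Dict("fit_file" => "fit.csv")

has_fit  = haskey(args, "fit_file")             # key is present
has_out  = haskey(args, "out_file")             # key is absent
out_file = get(args, "out_file", "out.csv")     # returns the default, dict is not modified
```
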

Collect instances into a dict

See https://discourse.julialang.org/t/collect-values-in-a-dict/64626

julia> l = [x => x%3 for x in 1:10];
julia> d = Dict{Int, Vector{Int}}()
julia> for (x,y) in l
           push!(get!(Vector{Int},d,y), x)
       end
julia> d

Dict{Int64, Vector{Int64}} with 3 entries:
  0 => [3, 6, 9]
  2 => [2, 5, 8]
  1 => [1, 4, 7, 10]
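The same grouping can be written with an explicit default-producing function, which may read more clearly when the default is not just an empty constructor call. A sketch with illustrative variable names (`pairs_list`, `groups`, `bucket` are not from the linked thread):

```julia
# Group values by key; get! inserts an empty vector the first time a key is seen.
pairs_list = [x => x % 3 for x in 1:10]
groups = Dict{Int, Vector{Int}}()

for (value, key) in pairs_list
    bucket = get!(() -> Int[], groups, key)   # existing vector, or a new empty one
    push!(bucket, value)
end
```
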

Similar in R (not explicit pairs):

> l <- (1:10)%%3
> setNames(lapply(unique(l), function(x) which(l == x)), unique(l))
$`1`
[1]  1  4  7 10

$`2`
[1] 2 5 8

$`0`
[1] 3 6 9

Development

PkgTemplates

This repo was set up using this snippet:

julia> using PkgTemplates
julia> t = Template(; 
    user="tp2750",
    dir=".",
    authors="Thomas Poulsen",
    julia=v"1.6",
    plugins=[
        License(; name="GPL-2.0+"),
        Git(; manifest=false, ssh=true),
        GitHubActions(;extra_versions=["nightly"], x86=false, windows=false, osx=false), ## skip some defaults
        Codecov(),
        Documenter{GitHubActions}(),
        Develop(),
    ],
  )
julia> t("tpNotes")

Created the repo "tpNotes.jl" in GitHub and just did:

tpNotes$ git push --set-upstream origin master

Note that the project name in PkgTemplates does not include ".jl", but the repo name does.

After a bit the "CI" and "codecov" badges turn green. But the "docs" badges do not work out of the box.

Using ssh

To switch the remote to ssh, first check the current URL with git remote -v, then change it with git remote set-url origin ...:

tpNotes$ git remote -v 
origin	https://github.com/tp2750/tpNotes.jl (fetch)
origin	https://github.com/tp2750/tpNotes.jl (push)

tpNotes$ git remote set-url origin  git@github.com:tp2750/tpNotes.jl.git

Remember the .git at the end.

Documentation using Documenter.jl

Modules needed in the documentation need to be loaded in the make.jl file. This is also the place to control the sidebar (in the pages = [] argument to makedocs). It is good practice to split the documentation into several files. See https://juliadocs.github.io/Documenter.jl/stable/man/guide/#Pages-in-the-Sidebar

Adding keys for Documenter and Github Actions

For documentation to automatically build, generate keys by running DocumenterTools.genkeys and follow the instructions.

Note: Remember to set the proper workflow permissions. In the horizontal menu choose Settings, then in the vertical menu Actions -> General. In the section "Workflow permissions", choose "Read and write permissions" and check the box "Allow GitHub Actions to create and approve pull requests". This is not needed to build documentation, but it is needed for CompatHelper to create pull requests when dependencies need to be updated.

(tpNotes) pkg> add DocumenterTools
julia> using tpNotes
julia> using DocumenterTools
julia> DocumenterTools.genkeys(user = "tp2750", repo="tpNotes.jl")

Name the public key (deploy key) "DOCUMENTER_PUB" and the private key (repository secret under Settings -> Secrets and variables -> Actions -> Repository secret) "DOCUMENTER_KEY".

Building the docs

To build the docs, cd to the docs folder and run make.jl in the context of the docs project:

tpNotes.jl/docs$ julia --project=. make.jl 

On GitHub, set GitHub Pages to build from the gh-pages branch, using the / (root) folder.

Examples

Code examples in documentation files can share context if they are named. The documentation does not mention it, but it looks like blocks with the same name have to be contiguous (two named sequences cannot be interleaved).

Eg:

<pre>

  a = 3
3
  print(a)
3

</pre>

Overloading Base operator

Overloading a base binary operator (like +):

  • Define my own struct.
  • Define a method of a base function using that struct. Use symbol notation for the operator.

No import or export needed.

Example

struct People
    name::String
end

Base.:+(p1::People, p2::People) = "$(string(p1.name)) and $(string(p2.name))"

Then we have

julia> using tpNotes
julia> p1 = tpNotes.People("Søren")
julia> p2 = tpNotes.People("Mette")
julia> p1 + p2 == "Søren and Mette"
true
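The same pattern can be tried directly in the REPL without the package. The sketch below uses a hypothetical `Team` type (not part of tpNotes) and also adds a method for `==`, written here with parentheses around the operator symbol:

```julia
# Hypothetical type for illustration; not part of tpNotes.
struct Team
    members::Vector{String}
end

# Symbol notation: Base.:+ adds a method to the existing + function.
Base.:+(a::Team, b::Team) = Team(vcat(a.members, b.members))

# The default == on structs with array fields compares by identity,
# so define equality in terms of the member lists.
Base.:(==)(a::Team, b::Team) = a.members == b.members

combined = Team(["Søren"]) + Team(["Mette"])
```

As in the example above, no import or export is needed because the methods are defined with the fully qualified `Base.` prefix.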

Code coverage

Computing code coverage locally is done as described in the README of the Coverage.jl package.

PackageCompiler

Remember to "dev" the local module from the "app" module. Otherwise you need to re-add it every time you make changes to the actual module.

Misc

Which project is active?

Base.active_project()

K-means

  • K-means

https://juliastats.org/Clustering.jl/dev/kmeans.html

Elements to cluster must be in columns, so a vector x has to be passed as a 1×n matrix (use x'):

julia> using Clustering
julia> x=vcat(0,repeat(1:1,10))
julia> res = kmeans(x',2) ## or kmeans(reshape(x, 1,11),2) 
julia> res.centers
1×2 Matrix{Float64}:
 1.0  0.0
julia> assignments(res)
11-element Vector{Int64}:
 2
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

K-medoids

Selects a representative point (the medoid) for each cluster.

Needs a distance matrix.

julia> using Distances
julia> x_dist = pairwise(Euclidean(), x'; dims=2)
# or simply (https://discourse.julialang.org/t/pairwise-distances-from-a-single-column-or-vector/29415/6)
julia> x_dist = [abs(i-j) for i in x, j in x]

julia> res2 = kmedoids(x_dist, 2)
julia> res2.medoids ## indices of medoid points
2-element Vector{Int64}:
 2
 1
# medoid points:
julia> x[res2.medoids]
2-element Vector{Int64}:
 1
 0
julia> assignments(res2)
11-element Vector{Int64}:
 2
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 ```
 
# Conversions

## matrix to vector and back

julia> m = [1 3 5; 2 4 6]
2×3 Matrix{Int64}:
 1  3  5
 2  4  6

julia> vec(m)
6-element Vector{Int64}:
 1
 2
 3
 4
 5
 6

julia> reshape(vec(m), size(m)) == m
true



# Tests
## Comparing at stated precision
https://discourse.julialang.org/t/compare-numbers-at-the-stated-precision/86719/10
In tests I would like to be able to write the following and have the test pass:

@test aresame(pi, 3.14)


The following works:

@test isapprox(pi, 3.14, atol = 0.005)


But then I need to adjust the absolute tolerance manually.

I need a function to find the “number of significant digits” of a numeric literal and use that.

The following apparently works, but I’m wondering if this already exists?

function sigdigs(x)
    # number of digits after the decimal point in the Float64 representation
    x_string = string(convert(Float64, x))
    length(x_string) - findlast('.', x_string)
end

function aresame(x, y)
    toldigits = min(sigdigs(x), sigdigs(y))
    tol = .49 * 0.1^(toldigits)
    isapprox(x, y, atol=tol)
end

julia> aresame(pi,3.1)
true

julia> aresame(pi,3.14)
true

julia> aresame(pi,3.141)
false

julia> aresame(pi,3.1415)
false


# Packages
Cool and useful packages

## [TerminalPager.jl](https://github.com/ronisbr/TerminalPager.jl)
Great for browsing large tables:

using TerminalPager, DataFrames
pager(DataFrame(rand(100, 100), :auto))


Press ? to get navigation help:

* Shift ->, Shift <- to move sideways
* u/d to move up/down by half a page, or
  Page-up/down for a full page
* < or g to go to top
* > or G to go to end

## [DefaultApplication.jl](https://github.com/tpapp/DefaultApplication.jl)
Basically just calling `xdg-open`, but still useful.

# Base functions

* `repr`: create a string from any value using the `show` function.
* `replace(string, pattern => replacement; [count])`: replace occurrences of `pattern` in `string`.

# Reuse installed version of packages
From Julia 1.9 we can change the default package installation strategy to Pkg.PRESERVE_TIERED_INSTALLED. The package manager then prefers versions of packages that are already installed, and only installs new versions when resolution would otherwise fail:

ENV["JULIA_PKG_PRESERVE_TIERED_INSTALLED"] = true


See https://docs.julialang.org/en/v1/manual/environment-variables/#JULIA_PKG_PRESERVE_TIERED_INSTALLED,
https://pkgdocs.julialang.org/v1/api/#Pkg.add

From doc of Pkg.add:

Pkg resolves the set of packages in your environment using a tiered algorithm. The preserve keyword argument allows you to key into a specific tier in the resolve algorithm. The following table describes the argument values for preserve (in order of strictness):

| Value | Description |
| -- | -- |
| PRESERVE_ALL_INSTALLED |  Like PRESERVE_ALL and only add those already installed |
| PRESERVE_ALL | Preserve the state of all existing dependencies (including recursive dependencies) |
| PRESERVE_DIRECT | Preserve the state of all existing direct dependencies |
| PRESERVE_SEMVER | Preserve semver-compatible versions of direct dependencies |
| PRESERVE_NONE | Do not attempt to preserve any version information |
| PRESERVE_TIERED_INSTALLED | Like PRESERVE_TIERED except PRESERVE_ALL_INSTALLED is tried first |
| PRESERVE_TIERED | Use the tier that will preserve the most version information while allowing version resolution to succeed (this is the default) | 
	

# startup.jl

My current `.julia/config/startup.jl` file

ENV["JULIA_EDITOR"] = "emacs"

ENV["JULIA_NUM_THREADS"] = "3,1"  # the value must be a string

using Revise
#using Infiltrator
#using DrWatson

using PkgTemplates

ENV["pkg_template"] = """Template(;julia=v"1.10",user="tp2750",dir = ".", plugins=[Git(; manifest=false, ssh=true),Documenter{GitHubActions}(),GitHubActions(extra_versions=["1.10", "1.11"])])"""
@info """To start a package, do:\nusing PkgTemplates\nt = eval(Meta.parse(ENV["pkg_template"]))\nt("MyPackage")"""

ENV["JULIA_PKG_PRESERVE_TIERED_INSTALLED"] = true

@info "PRESERVE_TIERED_INSTALLED set by JULIA_PKG_PRESERVE_TIERED_INSTALLED"

@info "To reset to default do: ENV[\"JULIA_PKG_PRESERVE_TIERED_INSTALLED\"] = false"

@info "To prefer already installed versions of libraries, set ENV[\"JULIA_PKG_PRESERVE_TIERED_INSTALLED\"] = true\n Undo by ENV[\"JULIA_PKG_PRESERVE_TIERED_INSTALLED\"] = false"

## https://twitter.com/heyjoshday/status/1555527185028395010
macro dev()
    # package name taken from the first line of Project.toml
    pkg = Symbol(
        replace(
            readline("Project.toml"),
            "name = \"" => "",
            '"' => ""
        )
    )
    esc(:(using Pkg; Pkg.activate("."); using Revise, $pkg))
end

## https://discourse.julialang.org/t/what-is-in-your-startup-jl/18228/2?u=tp2750
cdpkg(pkg) = cd(dirname(Base.find_package(string(pkg))))

macro cdpkg(pkg)
    cdpkg(pkg)
    return nothing
end
```
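The `dev` macro in the startup file reads the package name from the first line of Project.toml, which assumes that the `name` entry comes first. A possibly more robust sketch uses the TOML standard library (available from Julia 1.6); the helper name `project_name` and the demo file are made up for illustration:

```julia
using TOML

# Hypothetical helper: return the package name declared in a Project.toml file,
# or nothing if the file has no name entry.
function project_name(path::AbstractString = "Project.toml")
    project = TOML.parsefile(path)
    return get(project, "name", nothing)
end

# Demo on a temporary file where name is deliberately not on the first line.
demo_dir  = mktempdir()
demo_file = joinpath(demo_dir, "Project.toml")
write(demo_file, """
uuid = "00000000-0000-0000-0000-000000000000"
name = "MyPackage"
""")

demo_name = project_name(demo_file)
```

Inside the macro, `Symbol(project_name())` could then take the place of the `replace(readline(...))` expression.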