OpenDreamKit Project Review, Tue 2018-10-30


Tuesday 30 October 2018

See Agenda.

Reviewer feedback and unanswered questions

R1

Suggestion: use VRE to make digital artifacts citable

Is there collaboration with CoCalc?

MitM: make a comic about it

Specifically on D6.5 and D6.8, but applicable in general: include informal KPIs and success stories (e.g., uptake by other communities) in the deliverables.

R2

What is our position on CoCalc, and its closed architecture? Suggestion (Cremona): invite Bill Stein to next review

MathHub, 4.11: find outside evaluation, users

Make the conclusion of D4.12 (or was it a different 4.x?) more precise, maybe add guidelines

D5.11: Add an explanatory paragraph to the table on speedups. (Critical remark: already told us last time)

R3

Geographical distribution of users, e.g. Eastern Europe

###

Deliverable 5.11:

Include success stories (like logipedia) in the Deliverable and Technical Report

Comic about Math in the Middle

09:00 Welcome

R3. How do you measure impact and adoption?

Nicolas. See the presentations, but please ask again if needed

End: 09:55

09:30 Some context: case studies, impact

Science and Open Source: what do we learn from each other?

Talk by Viviane Pons

Start: 09:56

Q (R1): will you tell us more about sustainability for ODK?

Q (R1): how many RSEs vs researchers do you have?

R2. You’re in love with SageMath. Reading all the deliverables this year, I get the impression that this conflicts with your love of open science. Maybe SageMath is not the most open environment for the way you want to go in the future.

Nicolas: make it more friendly, diverse, make it easier to contribute.

R2. Feeling when reading about SageMathCloud becoming CoCalc.

Is there a confusion SageMath <-> SageMathCloud? Viviane explains one of the reasons to rename the latter to CoCalc.

Nicolas: what we’re doing is, in a way, making things modular and making sure someone could build a CoCalc from modular pieces.

End of questions: 10:33

Case study: Micromagnetics

Talk by Marijan Beg (Start: 10:34)

Nicolas: our proposal claimed our main target is mathematics but it would have impact outside mathematics.

Marijan will present micromagnetics.

Q (R1): can you do this in CoCalc? If so, what is the added value of the ODK work on this?

A (Nicolas): the ODK work here was enabling the technology. Most of the effort is to connect OOMMF to Jupyter. From there on, it can be used either in JupyterHub, in CoCalc, or other.

Q (R2): I can’t install it. A (Marijan): OK, we’ll have a look. Comment (R2): I will have physics questions too.

Q (R3): what kind of conferences? A: we usually choose the three biggest conferences in the field in a year.

End: 11:03

Q (R1): how do you cite this artefact, or whatever JOOMMF is? If you want somebody to reproduce your work, how do you cite it?

Some journals allow authors to submit supplementary material.

If not, we can create a GitHub repo for each paper, containing the supplementary material.

Comment(R1): I find this is an interesting aspect. Marijan: binder link lets you get all those notebooks live and play with them. Some journals are not happy with notebook links; the whole culture needs to be changed.

10:20 Coffee break, with demonstrators and posters

Demos stay in original room

Agreed: this seems to be the most reasonable option

11:20 Achievements, by work package, session 1

In new room; start 11:45.

WP4: User Interfaces

Lead: Vidar Fauske filling in for Benjamin Ragan-Kelley

Start: 11:47

We did:

Kernels: 49 -> 117, and 6 contributions from ODK: Cling, GAP, MMT, PARI/GP, SageMath, Singular

Notebooks found on GitHub for different kernels: SageMath, ~6.2k

Silent comment (Samuel): would be nice to also show stats for some non-ODK kernels.

Highlights:

LIVE DEMO!!!

Work to improve distribution

R2. At the end of D4.12: “it’s not necessarily apparent which package or combination of packages”. Are you, or will you be, in a position to make recommendations? At the moment it seems a little bit inconclusive.

A: we are trying to make it as modular as possible

ipyvolume + scivijs “might include”

I get the impression you don’t know what you want to do, and I have to judge this deliverable.

A (Michael): toolkit approach at work: flexibly allow people to choose tools

R2: The conclusion at the end of page 15 of D4.12 is not convincing (i.e. not conclusive about which technology is best). It might be better to write what you just said, i.e. say that our products are flexible, and write guidelines.

Vidar: it’s a lot easier to do 3d visualisation now than in 2015.

R2. As R1 says, provide a list of recommendations. “These are not firm recommendations but …”

Plan to write a blog post on this.

Next highlights:

(R1): comment on KPIs in general: you have a webpage listing the KPIs, but I could not find the value associated with each KPI. Am I doing something wrong?

(Nicolas): we have worked hard on KPIs for this review, we now need to centralize them as you suggest.

COOL THAT THE TIMER ON THE OTHER SCREEN IS ALSO DONE IN JUPYTER - SOMEONE MAY POINT THAT OUT :)

To summarize: made life easier for people inside ODK and outside ODK.

(R1, R3): really good work. End: 12:14

(R2): one small one: D4.11 (MathHub) was expected to be ready at the end of November; is it ready now? (Michael): will talk about it; still needs love and attention but it is running and we’re not very embarrassed by it.

Michael: these project reviews are very good motivators

(R2): The particular area of computational mathematics is the core of what we funded the project to do.

End questions: 12:16

WP3: Component Architecture

Lead: Luca De Feo

Start: 12:17

Tool: slides = Jupyter notebook running on Binder!

All participants in WP3, please stand up.

Binder starting…

Note: CoCalc does not give you anything like this.

Nicolas: Binder now supports more than GitHub

Binder is running on Cloud infrastructure all over the world. The reason it’s slow to start is everyone is using it: it’s very popular!

(R2): what is the speedup of the native Cygwin version vs the virtual machine solution?

The virtual machine solution carried the overhead of running the virtual machine, which often made this option unusable.

In addition, the extra step of installing VirtualBox or VMware was a barrier.

(Erik) It does not matter so much in terms of CPU-intensive computations. Now there’s a desktop icon that you click.

(Luca): this is more about being usable at all than about speed.

(Alex) You may want to interact with native Windows applications, and that is hard if SageMath is in the VM

(R1) Why such a variation in the number of users? -> not users but clicks on SageMath website to download Sage-Windows

Spikes correspond to releases. Silent answer (Clément): and seasonal variations (low download rate in July-Aug)

(R2): CoCalc and ODK: isn’t there some duplication of effort? How could this collaboration be more effective? (Luca/Nicolas): CoCalc is more exploratory; our work is adding sustainability. Many different answers and experiences. (Viviane): Binder and CoCalc are both useful and answer different needs.

(R3): information on geographical spread of downloads and installation?

(ODK): Binder likely has such statistics, we should get them

12:59 end talk.

Let’s go eat.

WP5: High Performance Mathematical Computing

Lead: Clément Pernet

Start: 14:04

(R1): HPC is not the same as cloud, do you help your users with recommendations?

(Clément): we have code for SIMD, multicore, HPC; ideas for cloud not translated into code yet

(Luca): Suggestion for next iteration of such a presentation: use JupyterLab when Vincent is showing Fibonacci things

Clément again

(suggestion) The slide should say: 7 papers published, more in preparation

(R1): why MPI and not OpenMP?

(Clément): we actually use both

(R1): ok; and does it make sense to use GPUs in this setting?

(Clément): yes; not clear we can offer something but working on it

(R3): do you work with people who do non-linear solving?

(R1): nice publications but ]

Vincent again

(R2): is this the same table as figure xxx in the deliverable?

Why only a 2.8x speedup for 8 cores?

(Vincent): because the tree is unbalanced; a lot of time is spent in communication to reallocate tasks to workers

(R2): The deliverable needs to make that clear.

We discussed that last time and I don’t see any amendments.
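
One way to read the figure questioned above is as parallel efficiency, i.e. speedup divided by the number of cores. The sketch below uses only the two numbers quoted in the question (2.8 and 8); the function name `parallel_efficiency` is illustrative and not taken from the deliverable.

```python
def parallel_efficiency(speedup, cores):
    """Return the fraction of ideal (linear) speedup actually achieved."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Numbers quoted in the question above: 2.8x speedup on 8 cores.
efficiency = parallel_efficiency(2.8, 8)
print("efficiency: {:.0%}".format(efficiency))
```

An efficiency of about 35% means that most of the 8-core capacity is not converted into speedup, which is the gap the reviewers ask the deliverable to explain.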

(R2): how small are these objects?

(Vincent): only up to permutations of 16 elements but if you iterate on all of them that’s factorial(16) objects; with SIMD this takes one hour instead of years

For larger permutations nothing; maybe in the future if CPU registers become larger.
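
To give a sense of scale for the figures above, the following sketch counts the permutations of 16 elements and the rate needed to visit all of them in one hour. The one-hour figure comes from the answer above; the one-byte-per-entry encoding is an assumption made here for illustration, not a description of the actual implementation.

```python
import math

n = 16
count = math.factorial(n)          # number of permutations of n elements
per_second = count / 3600          # rate needed to enumerate them all in one hour

# Assumption for illustration: one byte per entry, so a permutation of
# 16 elements occupies 16 * 8 = 128 bits.
bits_per_permutation = n * 8

print(count)                       # 20922789888000
print("{:.1e} permutations per second".format(per_second))
print(bits_per_permutation, "bits per permutation")
```

Under that assumed encoding a permutation is 128 bits wide, which would be consistent with the remark that larger permutations have to wait for larger CPU registers.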

(R3): tell us more about this superlinear speedup

(Clément): this tells us the 1-core implementation is probably inefficient.

(R1): what are your plans regarding Cilk?

(Clément): we did exploratory work on Cilk, but had no commitment to Cilk.

There is a competitor to Cilk, developed at Uni Barcelona. We can experiment more.

14:44 end talk.

13:00 Lunch on the premises, with demonstrators and posters

This took place before WP5 talk.

14:30 Achievements, by work package, session 2

WP6: Data/Knowledge/Software-Bases

Lead: Michael Kohlhase

Start: 14:44.

that’s the end of my talk; questions? : )

(R3): are you trying to make up for the gender imbalance by having female characters in your user stories?

(Michael): we actually hired a woman for the data aspects

(Nicolas): we need female characters in use cases so that female users can identify with them

(silent Samuel): we also have an actual user story: Andrea wanted to call Sage from GAP; she ended up using the Homalg GAP package for that. She needed GAP because her PhD advisor, Gabriela, and former PhD students of Gabriela, including Myriam, had programmed a lot of things in GAP. (Andrea Thevis; Gabriela Weitze-Schmithüsen; Myriam Finster.) She learnt about the Homalg way at GAP Days Fall 2018 in Siegen.

“Steve who naturally prefers GAP” :)

Future work: make PARI/GP MitM-compliant

Use case on LMFDB

KPIs and deliverables

(R1): it seems the logipedia success story should be included in your impact reports

(Michael): Logipedia was released yesterday, give us some time.

(R2): very interesting, both Logipedia and the interest from the theorem-proving community. Explain in layman’s terms what this is about.

(Michael): a blog post is the least we want to do. Florian and I are from the theorem proving community originally. We want to make this a big story.

(R2): make sure it goes into the deliverables.

(Michael): it seems to us you are suggesting we update the deliverable?

(R1): no: use the deliverables for the next reporting period.

(Nicolas): should we update deliverables with your previous suggestions?

(R2): I would like you to.

(R1): maybe do some comic about the math-in-the-middle

(Michael): absolutely: I would love to have my first comic

(Michael): I hope we will be able to continue at the European level.

The Jane story is maybe a little limited. If you think of what MitM has to offer to the computer algebra community… There is slight opposition to that. Whereas in the databases area, there is work to do.

End comments 15:17.

WP2: Community building, training, dissemination, exploitation and outreach

Lead: Viviane Pons

Start: 15:17

(silent Alex): many learners learn Jupyter at Software/Data Carpentry workshops: see e.g. lessons at https://software-carpentry.org/lessons/ Making Jupyter more robust and usable helps “Carpenters”, and we also get feedback from them.

(silent Samuel+1): the organisation is now “The Carpentries” after merger of Software Carpentry and Data Carpentry.

(silent Samuel): Lelière -> Lelièvre

(R1, R2): if some Software Carpentry work has been funded by OpenDreamKit then this should be acknowledged on the Carpentries website.

The Carpentries has a page of friend organisations, OpenDreamKit could be listed there.

Alex: https://software-carpentry.org/join/projects/ lists GAP already (submitted by me). I will suggest adding OpenDreamKit there.

(R3): Why only Twitter? What about LinkedIn?

(Viviane): I use Twitter, and I know that there is an open science community there, and it has been efficient for me in the past. I haven’t yet been convinced that LinkedIn would be as useful.

15:45 Wrap up

Nicolas M. Thiéry

(PO): Final review: within 60 days after the end of the project. After the end of the project, only costs related to the final review are eligible.

() What can be delivered next week concerning finances? By Tuesday, at least two tables showing the level absorbed so far, from the beginning of the period until now, per organization. Any important change in the finances should be reported as a note to the PO.

Nicolas:

16:00 Assessment preparation by Reviewers and the Project Officer

17:00 Feedback and discussion with reviewers

(R1) (this pad is public, so we don’t include reviewer names), lead reporter for this round of assessment:

Nice review. Interesting and clarifying.

The project has achieved most of its objectives with minor deviations

We encourage you to .. final review.

From the technical point of view, we consider that you did very good work.

Congratulations for this. Demo clarifying.

Impact is higher than expected, this is good and not so good.

Very good management work, considering the difficulty of managing a large project with many and challenging goals. However, there were some delays with the financial reports, although they did not have a major impact on the project and were finally resolved.

The financial statement is pending, submit it by Tuesday.

Recommendation for the current period: reopen one deliverable, D5.11 (justify timings + typos)

Also try to improve the presentations a little bit; they have some mistakes. Also include centralized KPIs.

Recommendation for the next reviewing period: financial statement available 15 days before the formal review.

(R2) you have done extremely well

(R3) Additional dimension to mathematical research: new theorem? Maybe for next time. Informal review some time ago: long list of deliverables, long list of workshops. We see you reduced the number of deliverables but not the number of workshops. Congratulations for what has been achieved.

clap clap clap

18:00 End of review

Question: can someone share the Timer code please?

from IPython.display import display
from ipywidgets import FloatProgress, HTML, Layout, VBox
import time


def timer(total, start=0):
    """Show a countdown of `total` seconds; `start` is the time already elapsed."""
    total = int(total)
    # Progress bar on top, remaining time as large text below.
    w1 = FloatProgress(max=total, bar_style='info', orientation='horizontal',
                       layout=Layout(width="950px", height="200px"))
    w2 = HTML(layout=Layout(height="100px"))
    w = VBox([w1, w2])
    display(w)

    T0 = time.time() - start
    while True:  # runs until the cell is interrupted
        T1 = time.time()
        D = T1 - T0  # elapsed seconds
        w1.value = D
        remain = total - D
        if remain >= 0:
            sign = "&nbsp;"
        else:
            # Over time: show the overrun with a minus sign.
            sign = "-"
            remain = -remain
        remain_minutes = int(remain // 60)
        remain_seconds = int(remain - remain_minutes * 60)
        w2.value = "<strong style='font-size:130px'>&nbsp;{}{}:{:02}</strong>".format(sign, remain_minutes, remain_seconds)

        # Bar colour: green, then yellow, orange, and red once over time.
        fraction = D / total
        if fraction < 0.8:
            w1.style.bar_color = "#66ff66"
        elif fraction < 0.9:
            w1.style.bar_color = "#ffe000"
        elif fraction <= 1:
            w1.style.bar_color = "#ff8000"
        else:
            w1.style.bar_color = "#d00000"
        # Wake up about 0.1 s after the next whole second of elapsed time.
        sleeptime = 1.1 - (D - int(D))
        time.sleep(sleeptime)
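
Since the timer loops until interrupted, its formatting and colour logic is easier to check outside a notebook when factored into pure functions. The following is a minimal sketch along those lines; the helper names `format_remaining` and `bar_color` are hypothetical and not part of the shared code, and the HTML markup (including the `&nbsp;` padding) is left out.

```python
def format_remaining(total, elapsed):
    """Format the time left as 'M:SS', with a leading '-' once over time."""
    remain = total - elapsed
    sign = "" if remain >= 0 else "-"
    remain = abs(remain)
    minutes = int(remain // 60)
    seconds = int(remain - minutes * 60)
    return "{}{}:{:02}".format(sign, minutes, seconds)


def bar_color(total, elapsed):
    """Pick the bar colour using the same thresholds as the timer above."""
    fraction = elapsed / total
    if fraction < 0.8:
        return "#66ff66"
    elif fraction < 0.9:
        return "#ffe000"
    elif fraction <= 1:
        return "#ff8000"
    return "#d00000"


print(format_remaining(600, 15))    # 9:45
print(format_remaining(600, 630))   # -0:30
print(bar_color(600, 570))          # #ff8000
```

In the notebook itself the timer would be started with a call such as `timer(20 * 60)` for a 20-minute slot.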

Debriefing

Very good rehearsal for next year. Next time: no unpleasant financial-report moment.

They kept saying “very good” but not “excellent”.

Implications of late financial reports

Remaining deliverables
