Myah Conn

1592610120

Make Computational Research Reproducible Wide-Spread with R Packages

Importance of making research reproducible
How to make research reproducible
Impact of R packages in the computational research
A nation is known by its research contributions, and the whole world is working to produce research for the betterment of society. The quality of research matters more than its quantity, so the impact of research on society matters more than the number of articles published. In this post, I share my observations on computational research and the many possibilities it offers.
All over the globe, many researchers are working toward their Ph.D. or in other research roles. Thanks to advances in computational capabilities, many of them do computational research in their respective domains, and many must be proposing and publishing excellent computational contributions. But what will be the future of these contributions?

#r #python #phd #reproducible-research #data-science

Brain Crist

1594661520

The ART of Reproducible Research

“Reproducibility” might seem an odd word, but its oddness is just as important as the benefits this big word delivers to the academic community.

To understand the full meaning of this term, we first need to understand its motivations. A pillar of the scientific process is validating the findings of a piece of research. The common way to present findings is to publish scientific articles in respected conferences and journals, which peer review the work and judge whether it meets the venue’s standards. But one thing missing from this process is validation and assurance that the findings were indeed obtained by the claimed methodology and, further, that the results will be the same if the work is done again. So, to verify the results, research must be reproducible: it must be possible to execute it again with the same environment, data and code and arrive at the same result the original authors claimed.

“Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to humans what we want the computer to do.” — Donald Knuth

With that in mind, we can see that providing the elements that let someone execute a piece of research again is very important. This guide explains some of the best practices for building reproducible research. Based on the literature and on personal experience, I share some tips, tricks and hacks on how to do research that others can understand and reproduce.

The core of reproducibility is making available the data, code, distribution, documentation and workflow [1] related to a piece of research. In this guide, I will share my thoughts, tips and tricks on these items. Each key element improves the chance that someone can reproduce another’s research. In practice, these elements can be translated into tools, frameworks, organization and coding best practices that ease the steep process of executing someone else’s code.
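As a rough illustration of the "environment" and "code" elements, here is a minimal base-R sketch. It assumes a script-based analysis; the simulated data, the seed value and the file name `session-info.txt` are hypothetical stand-ins, not something prescribed by the guide.

```r
# Minimal sketch: fix the random seed and record the software environment.
# The analysis itself is a stand-in (simulated data and a mean).
set.seed(42)          # fix the random number stream so reruns draw the same values
x <- rnorm(100)       # stand-in for the real data
result <- mean(x)     # stand-in for the real analysis

# Save the R version and loaded packages next to the results
# (hypothetical file name, written to a temporary directory here)
info_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(sessionInfo()), info_file)

# Rerunning with the same seed gives exactly the same number
set.seed(42)
rerun <- mean(rnorm(100))
stopifnot(identical(result, rerun))
```

Shipping a file like this with the code does not guarantee reproducibility, but it tells a reader which R version and packages produced the reported results.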

To start easy, the first key element I’ll introduce for building reproducible research is documentation. It involves all text accompanying the work, including the paper. I consider documentation the main element not only for reproducibility, but for any piece of code a developer writes.

#reproducibility #code #research #best-practices #reproducible-research

Marcus Flatley

1594399440

Getting Started with R Markdown — Guide and Cheatsheet

In this blog post, we’ll look at how to use R Markdown. By the end, you’ll have the skills you need to produce a document or presentation using R Markdown, from scratch!

We’ll show you how to convert the default R Markdown document into a useful reference guide of your own. We encourage you to follow along by building out your own R Markdown guide, but if you prefer to just read along, that works, too!

R Markdown is an open-source tool for producing reproducible reports in R. It enables you to keep all of your code, results, plots, and writing in one place. R Markdown is particularly useful when you are producing a document for an audience that is interested in the results from your analysis, but not your code.

R Markdown is powerful because it can be used for data analysis and data science, collaborating with others, and communicating results to decision makers. With R Markdown, you have the option to export your work to numerous formats including PDF, Microsoft Word, a slideshow, or an HTML document for use in a website.

Turn your data analysis into pretty documents with R Markdown.

We’ll use the RStudio integrated development environment (IDE) to produce our R Markdown reference guide. If you’d like to learn more about RStudio, check out our list of 23 awesome RStudio tips and tricks!

Here at Dataquest, we love using R Markdown for coding in R and authoring content. In fact, we wrote this blog post in R Markdown! Also, learners on the Dataquest platform use R Markdown for completing their R projects.

We included fully-reproducible code examples in this blog post. When you’ve mastered the content in this post, check out our other blog post on R Markdown tips, tricks, and shortcuts.

Okay, let’s get started with building our very own R Markdown reference document!

1. Install R Markdown

R Markdown is a free, open source tool that is installed like any other R package. Use the following command to install R Markdown:

install.packages("rmarkdown")

Now that R Markdown is installed, open a new R Markdown file in RStudio by navigating to File > New File > R Markdown…. R Markdown files have the file extension “.Rmd”.

2. Default Output Format

When you open a new R Markdown file in RStudio, a pop-up window appears that prompts you to select the output format to use for the document.

The default output format is HTML. With HTML, you can easily view it in a web browser.

We recommend selecting the default HTML setting for now. It can save you time! Why? Because compiling an HTML document is generally faster than generating a PDF or other format. When you near a finished product, you can change the output to the format of your choosing and then make the final touches.

One final thing to note is that the title you give your document in the pop-up above is not the file name! Navigate to File > Save As… to name and save the document.
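The same choice of output format can also be made outside the pop-up. The sketch below writes a skeleton `.Rmd` file whose header sets `output: html_document` and renders it from the console if the `rmarkdown` package and pandoc are available. The file name, title and chunk contents are hypothetical; `rmarkdown::render()` and `rmarkdown::pandoc_available()` are existing functions of the package.

```r
# Minimal sketch: create a skeleton R Markdown file and render it to HTML.
fence <- strrep("`", 3)  # three backticks, the code chunk delimiter
rmd_file <- file.path(tempdir(), "reference-guide.Rmd")  # hypothetical name

rmd_lines <- c(
  "---",
  'title: "R Markdown Reference Guide"',
  "output: html_document",  # change to pdf_document or word_document later
  "---",
  "",
  "## Summary of the built-in cars data",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
)
writeLines(rmd_lines, rmd_file)

# Render only when the tooling is installed; otherwise just keep the file
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, output_format = "html_document",
                                 quiet = TRUE)
}
```

Switching the final document to another format then only means changing the `output:` line, or passing a different `output_format` to `render()`.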

#data science tutorials #beginner #r #r markdown #r tutorial #r tutorials #rstats #rstudio #tutorial #tutorials

Brain Crist

1597338000

Testing young children’s computational thinking

Computational thinking (CT) comprises a set of skills that are fundamental to computing and are being taught in more and more schools across the world. There has been much debate about the details of what CT is and how it should be approached in education, particularly for younger students.

In our research seminar this week, we were joined by María Zapata Cáceres from the Universidad Rey Juan Carlos in Madrid. María shared research she and her colleagues have done around CT. Specifically, she presented work on how we can understand what CT skills young children are developing. Building on existing work on assessing CT, she and her colleagues have developed a reliable test for CT skills that can be used with children as young as 5.

Why do we need to test computational thinking?

Until we can assess something, María argues, we don’t know what children have or haven’t learned or what they are capable of. While testing is often associated with the final stages in learning, in order to teach something well, educators need to understand where their students’ skills are to know what they are aiming for them to learn. With CT being taught in increasing numbers of schools and in many different ways, María argues that it is imperative to be able to test learners on it.

How was the test developed?

One of the key challenges for assessing learning is knowing whether the activities or questions you present to learners are actually testing what you intend them to. To make sure this is the case, assessments go through a process of validation: they are tried out with large groups to ensure that the results they give are valid. María’s and her colleagues’ CT test for beginners is based on a CT test developed by researcher Marcos Román González. That test had been validated, but since it is aimed at 10- to 16-year-olds, María and her colleagues needed to adapt it for younger children and then validate the adapted test.

Developing the first version

The new test for beginners consists of 25 questions, each of which has four possible responses, which are to be answered within 40 minutes. The questions are of two types: one that involves using instructions to draw on a canvas, and one that involves moving characters through mazes. Since the test is for younger children, María and her colleagues designed it so it involves as little text as possible to reduce the need for reading; instead the test includes self-explanatory symbols.

Developing a second version based on feedback

To refine the test, the researchers consulted with a group of 45 experts about the difficulty of the questions and the test’s length. The general feedback was very positive.

Drawing on the experts’ feedback, María and her colleagues made some very specific improvements to the test to make it more appropriate for younger children:

  • The improved test mandates that a verbal explanation be given to children at the start, to make sure they clearly understand how to take the test and don’t have to rely on reading the instructions.
  • In some areas, the researchers added written explanations where experts had identified that questions contained ambiguity that could cause the children to misinterpret them.
  • A key improvement was to adapt the grids in the original test to include pathways between each box of the maze. It was found that children could misinterpret the maze, for example as allowing diagonal moves between squares; the added pathways are visual cues that make it clear that this is not possible.

Validating the test

After these improvements, the test was validated with 299 primary school students aged 5-12. To assess the differences the improvements might make, the students were given different versions of the test. María and her colleagues found that the younger students benefited from the improvements, and the improvements made the test more reliable for testing students’ computational thinking: students made fewer errors due to ambiguity and misinterpretation.

Statistical analysis of the test results showed that the improved version of the test is reliable and can be used with confidence to assess the skills of younger children.

What can you use this test for?

Firstly, the test is a tool for educators who want to assess the skills young people have and develop over time. Secondly, the test is also valuable for researchers. It can be used to perform projects that evaluate the outcomes of different approaches to teaching computational thinking, as well as projects investigating the effectiveness of specific learning resources, because the test can be given to children before and again after they engage with the resources.

Assessment is one of the many tools educators use to shape their teaching and promote the learning of their students, and tools like this CT test developed by María and her colleagues allow us to better understand what children are learning.

Find out more & join our next seminar

The video and slides of María’s presentation are available on our seminars page. To find out more about this test, and the process used to create and validate it, read the paper by María and her colleagues.

Our final seminar of this series takes place Tuesday 28 July before we take a break for the summer. In the session, we will explore gender balance in computing, led by Katharine Childs, who works on the Gender Balance in Computing research project at the Raspberry Pi Foundation. You can find out more and sign up to attend for free on our Computing Education Research Seminars page.

#education #research #computational thinking #computing education #research seminar #assessment

George Koelpin

1603256400

The Most Underrated R packages

In my experience as an R user, I’ve come across a lot of different packages and curated lists. Some are in my bookmarks, like the great awesome-R list or the monthly “best of” list curated by RStudio. If you don’t know them, go check them out asap.

In this post, I’d like to show you something else. These are the results of late-night GitHub/Reddit browsing, and cool stuff shared by colleagues.

Some of these packages are really unique; others are just fun to use and real underdogs among the data scientists and statisticians I’ve worked with.

Let’s start!

💥Misc (the weird ones) 💥

  • BRRR and beepr: Have you ever wanted to know, and celebrate, when your simulations are finally done running in R? Have you ever been so proud of pulling off a tricky bit of code that you wanted Flavor Flav to yell “yeaaahhhh, boi!!” as soon as it successfully completes?
  • calendR: Ready to print monthly and yearly calendars made with ggplot2.
  • checkpoint: It makes it possible to install package versions from a specific date in the past as if you had a CRAN time machine.
  • DataEditR: DataEditR is a lightweight package to interactively view, enter or edit data in R.
  • drake: It analyzes your workflow, skips steps with up-to-date results, and orchestrates the rest with optional distributed computing. In the end, drake provides evidence that your results match the underlying code and data, which increases your ability to trust your research.
  • flow: Visualize as flow diagrams the logic of functions, expressions or scripts and ease debugging.
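To give a feel for what "skips steps with up-to-date results" means in the drake entry above, here is a base-R sketch of the idea. It is not drake's API: `run_if_stale()` is a hypothetical helper, and comparing file timestamps is a simplification assumed for this example.

```r
# Hypothetical helper: rerun a step only when its output file is missing
# or older than its input file. Returns TRUE if the step ran.
run_if_stale <- function(input, output, step) {
  stale <- !file.exists(output) || file.mtime(output) < file.mtime(input)
  if (stale) step(input, output)
  stale
}

# Stand-in workflow step: read a CSV and write the mean of one column
summarise_step <- function(input, output) {
  data <- read.csv(input)
  writeLines(as.character(mean(data$x)), output)
}

input  <- tempfile(fileext = ".csv")
output <- tempfile(fileext = ".txt")
write.csv(data.frame(x = 1:10), input, row.names = FALSE)

first_run  <- run_if_stale(input, output, summarise_step)  # output missing: step runs
second_run <- run_if_stale(input, output, summarise_step)  # output current: step skipped
```

A workflow tool applies the same check across a whole dependency graph of steps, which is what saves time on long analyses.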

#analytics #data-science #r #statistics #r-package