EvalChatGPT-GenerateRCode-CISOSE2024-Invited+Track.pdf

User Centric Evaluation of Code Generation Tools

With the rapid advance of machine learning (ML) technology, large language models (LLMs) are increasingly explored as an intelligent tool to generate program code from natural language specifications. However, existing evaluations of LLMs have focused on their capabilities in comparison with humans. It is desirable to evaluate their usability when deciding on whether to use an LLM in software production. This paper proposes a user centric method for this purpose. It includes metadata in the test cases of a benchmark to describe their usages, conducts testing in a multi-attempt process that mimics the uses of LLMs, measures LLM generated solutions on a set of quality attributes that reflect usability, and evaluates the performance based on user experiences in the uses of LLMs as a tool. The paper also reports a case study with the method in the evaluation of ChatGPT’s usability as a code generation tool for the R programming language. Our experiments demonstrated that ChatGPT is highly useful for generating R p…

Type: conference paper
Creators: Miah, Tanha; Zhu, Hong;
Year: 2024
Access: metadataOnlyAccess
Status: Live | Last updated: July 19, 2024 5:33 PM
ScenEval.pdf

ScenEval: A Benchmark for Scenario-Based Evaluation of Code Generation

In the scenario-based evaluation of machine learning models, a key problem is how to construct test datasets that represent various scenarios. The methodology proposed in this paper is to construct a benchmark and attach metadata to each test case. Then a test system can be constructed with test morphisms that filter the test cases based on metadata to form a dataset. The paper demonstrates this methodology with large language models for code generation. A benchmark called ScenEval is constructed from problems in textbooks, an online tutorial website and Stack Overflow. Filtering by scenario is demonstrated and the test sets are used to evaluate ChatGPT for Java code generation. Our experiments found that the performance of ChatGPT decreases with the complexity of the coding task. It is weakest for advanced topics like multi-threading, data structure algorithms and recursive methods. The Java code generated by ChatGPT tends to be much shorter than the reference solution in terms of number of lines, while it is mo…

Type: conference paper
Creators: Paul, Debalina Ghosh; Zhu, Hong; Bayley, Ian;
Year: 2024
Access: metadataOnlyAccess
Status: Live | Last updated: July 19, 2024 5:32 PM
SurveyCodeGenEval-final.pdf

Benchmarks and Metrics for Evaluations of Code Generation: A Critical Review

With the rapid development of Large Language Models (LLMs), a large number of machine learning models have been developed to assist programming tasks including the generation of program code from natural language input. However, how to evaluate such LLMs for this task is still an open problem despite the great amount of research effort that has been made and reported to evaluate and compare them. This paper provides a critical review of the existing work on the testing and evaluation of these tools with a focus on two key aspects: the benchmarks and the metrics used in the evaluations. Based on the review, further research directions are discussed.

Type: conference paper
Creators: Paul, Debalina Ghosh; Zhu, Hong; Bayley, Ian;
Year: 2024
Access: metadataOnlyAccess
Status: Live | Last updated: July 19, 2024 5:31 PM
oso-9780198867432-chapter-1.pdf

Cleaning up renaissance Italy : environmental ideals and urban practice in Genoa and Venice. Introduction

People and goods from across the globe filled the vibrant ports of Genoa and Venice during the Renaissance. This book takes us onto the streets, bridges, and waterways of these significant, sensuous cities to reveal the ambitious schemes undertaken to promote the cleanliness and health of their communities. Along the way, we encounter a broad and fascinating cross-section of Renaissance society—from courtesans to street-food sellers and architects to canal diggers—and, using new archival sources, uncover both the ideals and lived experiences of health and environmental management. An illuminating and original account of social policies, urban design, and environmental management between 1400 and 1600, Cleaning Up Renaissance Italy provides a new, multidisciplinary history of Renaissance Italy. -- Supplied by publisher.

Type: book part
Creators: Stevens Crawshaw, Jane L.;
Year: 2023
Access: embargoedAccess
Status: Live | Last updated: July 19, 2024 3:28 PM
Calibration in the Disciplines - Geography - 9781032460277 - 2024 - Hill Walkington Page Wyse.pdf

The need for calibration in the disciplines : a case study from geography

The debate about degree outcomes and comparability of academic standards in Higher Education has become increasingly prominent in academic, political and media discussions internationally. This chapter reports on work undertaken as part of the UK Degree Standards Project in collaboration with the Royal Geographical Society (with Institute of British Geographers). The intent of the work was for the geography community in the UK, supported by its learned society and professional body, to respond to public concerns about ‘grade inflation’ in relation to degree outcomes. The chapter first presents data on degree outcomes in the UK and with respect to the subject of geography more specifically. It goes on to report the results of training activities within the discipline across a range of geographical scales to increase the use of calibration as a means of providing transparent assurance about the quality of assessment practices. Despite the collaborative process engendering a sense of ownership and willingness to…

Type: book part
Creators: Hill, Jennifer; Walkington, Helen; Page, Ben; Wyse, Stephanie;
Year: 2024
Access: embargoedAccess
Status: Live | Last updated: July 19, 2024 2:06 PM
Self-concordance theory and the goal-striving reasons framework - 2024 - Ehrlich Cripps Ehrlich.pdf

Self-concordance theory and the goal-striving reasons framework and their distinct relationships with hedonic and eudaimonic well-being

Self-concordance theory and the goal-striving reasons framework both measure the quality of people’s reasons for their goal pursuits. Both have provided substantial evidence for their predictive power for people’s well-being. However, it remains unclear which of the two goal-reason models is the better predictor for different forms of well-being. The paper analyses the distinct relationships of the two models in relation to hedonic well-being (Subjective Well-Being, Life Satisfaction, Affect Balance) and indicators of eudaimonic well-being (Basic Need Satisfaction, Purpose and Self-Acceptance). The findings are based on a cross-sectional, correlative research design (N = 124). Using multiple regression analyses, the results show that the goal-striving reasons framework is overall more strongly associated with hedonic and eudaimonic well-being. However, the differences for hedonic well-being as well as for self-acceptance and purpose are much larger than they are for the three basic needs of autonomy, com…

Type: journal article
Creators: Ehrlich, Christian; Cripps, Karen; Ehrlich, S.;
Year: [in press]
Access: postEmbargoOpenAccess
Status: Live | Last updated: July 19, 2024 1:45 PM
Anaesthesia - 2024 - Gustafson - The impact of musculoskeletal ill health on quality of life and function after critical.pdf

The impact of musculoskeletal ill health on quality of life and function after critical care : a multicentre prospective cohort study

Physical disability is a common component of post-intensive care syndrome, but the importance of musculoskeletal health in this population is currently unknown. We aimed to determine the musculoskeletal health state of intensive care unit survivors and assess its relationship with health-related quality of life; employment; and psychological and physical function. We conducted a multicentre prospective cohort study of adults admitted to intensive care for > 48 h without musculoskeletal trauma or neurological insult. Patients were followed up 6 months after admission where musculoskeletal health state was measured using the validated Musculoskeletal Health Questionnaire score. Of the 254 participants, 150 (59%) had a musculoskeletal problem and only 60 (24%) had received physiotherapy after discharge. Functional Comorbidity Index, Clinical Frailty Scale, duration of intensive care unit stay and prone positioning were all independently associated with worse musculoskeletal health. Musculoskeletal health state m…

Type: journal article
Creators: Gustafson, O.D.; King, E.B.; Schlussel, M.M.; Arnold, A.; Wade, C.; Nicol, P.S.; Rowland, M.J.; Dawes, H.; Williams, M.A.;
Year: 2024
Access: openAccess
Status: Live | Last updated: July 19, 2024 1:21 PM
Novel digitisation method - 2024 - Devi Khandelwal Vidis Plecenik Jabir.pdf

A novel digitisation method for pulse switchable memristive chemical sensors

Memristors, typically considered for their non-volatile resistive memory for high-density memory designs, have shown very good sensitivity to various chemicals. As such, these devices can also be fabricated as chemical sensors with intrinsic memory. When fabricated for sensing chemicals, the switching state of the devices, depending on the amount of the applied bias voltage/current, also changes in the presence of the chemicals, compared to when the chemicals are not present. We have observed that this property can be combined with the device’s intrinsic memory to directly digitise sensed information. To this end, in this paper, we propose an innovative technique to directly digitise the sensor readings, e.g. gas concentrations, simply by pulsing the devices to exploit the memory behaviour in the presence of a chemical to change the state of the device. Essentially, we are relying on the sensors to tell us when a certain property of a chemical is sensed by switching its state while this information is digitis…

Type: conference paper
Creators: Devi, Meenakshi; Khandelwal, Saurabh; Vidis, Marek; Plecenik, Tomas; Jabir, Abusaleh;
Year: [in press]
Access: metadataOnlyAccess
Status: Live | Last updated: July 19, 2024 1:05 PM
CRA-Case+Studies—3_TP.pdf

Poetic testimony

Type: book part
Creators: Kazan, Helene;
Year: 2024
Status: Live | Last updated: July 19, 2024 12:40 PM
Agricultural practices to mitigate food insecurity in Madagascar - 9783384010551 - 2023 - Ralambomanantsoa Ramahatanarivo Donati Eppley Ganzhorn Glos Kübler Ratovonamana Rakotondranary.pdf

Towards new agricultural practices to mitigate food insecurity in southern Madagascar

The south of Madagascar suffers from recurrent droughts with catastrophic effects on the human population and the globally unique biodiversity alike. During these times and shortly thereafter, households from only two out of 24 villages with a total of 374 households achieve food security and most households resort to food resources provided by the remaining forests and fallow land. This poses the question why forest food resources persist and remain available even when agricultural crops fail. The main difference seems to be that the majority of agricultural crops are annual plants that need to be replanted for each growing cycle and do not provide anything if the regular harvest fails, while 48 of 50 forest food resources used by people during times of food shortage are perennial and often woody species that can tolerate prolonged droughts. For improved food security, annual crops could be replaced by, or combined with, perennial crops in various agroforestry systems. These agroforestry systems could be desi…

Type: book part
Creators: Ralambomanantsoa, Tiana F.; Ramahatanarivo, Mialitiana E.; Donati, Giuseppe; Eppley, Timothy M.; Ganzhorn, Jörg U.; Glos, Julian; Kübler, Daniel; Ratovonamana, Yedidya R.;
Year: 2023
Access: postEmbargoOpenAccess
Status: Live | Last updated: July 19, 2024 12:15 PM