DuetCS: Code Style Transfer through Generation and Retrieval

Publication: Contribution to book/report/anthology/conference proceedings, Conference paper, Research, Peer-reviewed

Authors

  • Binger Chen
  • Ziawasch Abedjan

Organisational units

External organisations

  • Technische Universität Berlin

Details

Original language: English
Title of host publication: 2023 IEEE/ACM 45th International Conference on Software Engineering
Subtitle: ICSE 2023
Publisher: IEEE Computer Society
Pages: 2362-2373
Number of pages: 12
ISBN (electronic): 9781665457019
ISBN (print): 978-1-6654-5702-6
Publication status: Published - 2023
Event: 45th IEEE/ACM International Conference on Software Engineering - Melbourne, Australia
Duration: 14 May 2023 to 20 May 2023

Publication series

Name: Proceedings - International Conference on Software Engineering
ISSN (print): 0270-5257

Abstract

Coding style has a direct impact on code comprehension. Automatically transferring code style to match a user's preference or to enforce consistency can facilitate project cooperation and maintenance, as well as maximize the value of open-source code. Existing work on automating code stylization is either limited to code formatting or requires human supervision in pre-defining style checking and transformation rules. In this paper, we present unsupervised methods to assist automatic code style transfer for arbitrary code styles. The main idea is to leverage a Big Code database to learn style and content embeddings separately, and then to generate or retrieve a piece of code with the same functionality and the desired target style. We carefully encode style and content features, so that a style embedding can be learned from arbitrary code. We explored the capabilities of novel attention-based style generation models and meta-learning and implemented our ideas in DUETCS. We complement the learning-based approach with a retrieval mode, which uses the same embeddings to directly search for the desired piece of code in Big Code. Our experiments show that DUETCS captures more style aspects than existing baselines.
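
The retrieval mode described in the abstract can be illustrated with a minimal sketch. It is not the DUETCS implementation: the `Snippet` type, the `retrieve` function, the equal weighting of content and style, the use of cosine similarity, and the hand-written two-dimensional vectors are all assumptions made for illustration. In the paper the embeddings are learned from Big Code; here they are toy values.

```python
import math
from typing import List, NamedTuple, Sequence


class Snippet(NamedTuple):
    """A corpus entry with separate (hypothetical) content and style embeddings."""
    code: str
    content_vec: Sequence[float]
    style_vec: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(corpus: List[Snippet],
             content_query: Sequence[float],
             style_query: Sequence[float],
             content_weight: float = 0.5) -> Snippet:
    """Return the snippet that best matches both the functionality of the
    input code (content_query) and the desired target style (style_query).
    The weighted sum is an assumed scoring rule, not taken from the paper."""
    def score(s: Snippet) -> float:
        return (content_weight * cosine(s.content_vec, content_query)
                + (1.0 - content_weight) * cosine(s.style_vec, style_query))
    return max(corpus, key=score)


# Toy corpus: content axis 0 = "addition", axis 1 = "multiplication";
# style axis 0 = "compact", axis 1 = "descriptive names, multi-line".
CORPUS = [
    Snippet("def add(a, b): return a + b", (1.0, 0.0), (1.0, 0.0)),
    Snippet("def add(first, second):\n    return first + second", (1.0, 0.0), (0.0, 1.0)),
    Snippet("def mul(first, second):\n    return first * second", (0.0, 1.0), (0.0, 1.0)),
]

# Query: same functionality as an addition function, in the descriptive style.
best = retrieve(CORPUS, content_query=(1.0, 0.0), style_query=(0.0, 1.0))
print(best.code)
```

Keeping the two embeddings separate is what lets the query combine the content of one piece of code with the style of another; a single joint embedding would not allow the two to be specified independently.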

Cite this

DuetCS: Code Style Transfer through Generation and Retrieval. / Chen, Binger; Abedjan, Ziawasch.
2023 IEEE/ACM 45th International Conference on Software Engineering: ICSE 2023. IEEE Computer Society, 2023. pp. 2362-2373 (Proceedings - International Conference on Software Engineering).

Chen, B & Abedjan, Z 2023, DuetCS: Code Style Transfer through Generation and Retrieval. in 2023 IEEE/ACM 45th International Conference on Software Engineering: ICSE 2023. Proceedings - International Conference on Software Engineering, IEEE Computer Society, pp. 2362-2373, 45th IEEE/ACM International Conference on Software Engineering, Melbourne, Australia, 14 May 2023. https://doi.org/10.1109/ICSE48619.2023.00198
Chen, B., & Abedjan, Z. (2023). DuetCS: Code Style Transfer through Generation and Retrieval. In 2023 IEEE/ACM 45th International Conference on Software Engineering: ICSE 2023 (pp. 2362-2373). (Proceedings - International Conference on Software Engineering). IEEE Computer Society. https://doi.org/10.1109/ICSE48619.2023.00198
Chen B, Abedjan Z. DuetCS: Code Style Transfer through Generation and Retrieval. In 2023 IEEE/ACM 45th International Conference on Software Engineering: ICSE 2023. IEEE Computer Society. 2023. p. 2362-2373. (Proceedings - International Conference on Software Engineering). doi: 10.1109/ICSE48619.2023.00198
Chen, Binger ; Abedjan, Ziawasch. / DuetCS : Code Style Transfer through Generation and Retrieval. 2023 IEEE/ACM 45th International Conference on Software Engineering: ICSE 2023. IEEE Computer Society, 2023. pp. 2362-2373 (Proceedings - International Conference on Software Engineering).
@inproceedings{6c8789cd92504d788df063d0c3a240db,
title = "DuetCS: Code Style Transfer through Generation and Retrieval",
abstract = "Coding style has direct impact on code comprehension. Automatically transferring code style to user's preference or consistency can facilitate project cooperation and maintenance, as well as maximize the value of open-source code. Existing work on automating code stylization is either limited to code formatting or requires human supervision in pre-defining style checking and transformation rules. In this paper, we present unsupervised methods to assist automatic code style transfer for arbitrary code styles. The main idea is to leverage Big Code database to learn style and content embedding separately to generate or retrieve a piece of code with the same functionality and the desired target style. We carefully encode style and content features, so that a style embedding can be learned from arbitrary code. We explored the capabilities of novel attention-based style generation models and meta-learning and implemented our ideas in DUETCS. We complement the learning-based approach with a retrieval mode, which uses the same embeddings to directly search for the desired piece of code in Big Code. Our experiments show that DUETCS captures more style aspects than existing baselines.",
author = "Binger Chen and Ziawasch Abedjan",
note = "Funding Information: This work was funded by the German Ministry for Education and Research as BIFOLD - Berlin Institute for the Foundations of Learning and Data (ref. 01IS18025A and ref. 01IS18037A). ; 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023 ; Conference date: 14-05-2023 Through 20-05-2023",
year = "2023",
doi = "10.1109/ICSE48619.2023.00198",
language = "English",
isbn = "978-1-6654-5702-6",
series = "Proceedings - International Conference on Software Engineering",
publisher = "IEEE Computer Society",
pages = "2362--2373",
booktitle = "2023 IEEE/ACM 45th International Conference on Software Engineering",
address = "United States",

}

TY - GEN

T1 - DuetCS

T2 - 45th IEEE/ACM International Conference on Software Engineering

AU - Chen, Binger

AU - Abedjan, Ziawasch

N1 - Funding Information: This work was funded by the German Ministry for Education and Research as BIFOLD - Berlin Institute for the Foundations of Learning and Data (ref. 01IS18025A and ref. 01IS18037A).

PY - 2023

Y1 - 2023

N2 - Coding style has direct impact on code comprehension. Automatically transferring code style to user's preference or consistency can facilitate project cooperation and maintenance, as well as maximize the value of open-source code. Existing work on automating code stylization is either limited to code formatting or requires human supervision in pre-defining style checking and transformation rules. In this paper, we present unsupervised methods to assist automatic code style transfer for arbitrary code styles. The main idea is to leverage Big Code database to learn style and content embedding separately to generate or retrieve a piece of code with the same functionality and the desired target style. We carefully encode style and content features, so that a style embedding can be learned from arbitrary code. We explored the capabilities of novel attention-based style generation models and meta-learning and implemented our ideas in DUETCS. We complement the learning-based approach with a retrieval mode, which uses the same embeddings to directly search for the desired piece of code in Big Code. Our experiments show that DUETCS captures more style aspects than existing baselines.

UR - http://www.scopus.com/inward/record.url?scp=85171738729&partnerID=8YFLogxK

U2 - 10.1109/ICSE48619.2023.00198

DO - 10.1109/ICSE48619.2023.00198

M3 - Conference contribution

AN - SCOPUS:85171738729

SN - 978-1-6654-5702-6

T3 - Proceedings - International Conference on Software Engineering

SP - 2362

EP - 2373

BT - 2023 IEEE/ACM 45th International Conference on Software Engineering

PB - IEEE Computer Society

Y2 - 14 May 2023 through 20 May 2023

ER -