
Thoughts on software/code generation

author:Geek Park

Author: Su Wen

1. Advantages of intelligent code generation

1. Software applications are implemented mainly in code

Since the information revolution, software applications have mapped each business activity onto a carrier in the binary world, making that activity expressible and optimizable in digital form.

Software applications are implemented primarily in code. So far, there are nearly 100 million programmers in the world and nearly 10 million programmers in China.

With the continuous advancement of machine learning and artificial intelligence, intelligent code generation is becoming a new focus of the industry. Code generation models trained at large scale can not only write high-quality code automatically but also complete a range of complex software development tasks, which greatly improves the efficiency and productivity of software development.

All of this is profoundly affecting the software industry. Software is an increasingly important means of production, yet the industry often faces an imbalance between demand and supply; in the future, code generation may be what satisfies the demand for software.

As code generation matures, companies will focus more on iterations of the business itself.

At present, general-purpose large models worldwide show an obvious ceiling in code generation.

According to statistics, the accuracy of code generation is between 30% and 40%. As a result, most products take the form of a Copilot built on code completion, and many agent products have also emerged.

Autopilot, meaning end-to-end generation of complete software and applications rather than code fragments, is the broader direction for code generation.

In the longer term, "code is not just a product, it's also a path to artificial general intelligence."

2. Advantages of intelligent code generation

Compared with traditional code writing, intelligent code generation has the following outstanding advantages:

Intelligent auto-coding

Through deep learning, the code model can understand the semantics and structure of the code, and automatically generate high-quality code according to the input requirements. Users only need to briefly describe the requirements, and the model can quickly generate code that meets the requirements, greatly reducing the coding workload.

Cross-language support

Advanced code models can generate code across programming languages. Developers describe requirements in natural language, and the model generates the corresponding code without being limited to a specific programming language, which greatly improves the flexibility and adaptability of software development.

Code optimization and refactoring

The code model can not only generate new code, but also optimize and refactor the existing code to improve the readability, maintainability and efficiency of the code. This is very helpful for code quality control and technical debt management.

Automated software development

Combined with other AI technologies and products, the code model can fully automate software development. From requirements analysis and design through coding to testing and deployment, the entire software life cycle can be completed autonomously by the AI system, which will greatly improve the speed and quality of software delivery.

2. The impact of intelligent code generation on the software industry

1. The endgame of intelligent code generation: more focused professional services + software generation platforms with automated delivery

Although discussion of how large-model code generation affects the software industry has focused on the SaaS field, in practice the impact should be viewed across the whole industry.

In a nutshell, the endgame of code generation is expected to reshape the software industry into two types: more focused professional services, and software generation platforms that automate delivery.

We'll break it down in a few ways:

There are three categories of software business: professional services, popular standard products, and enterprise solutions

The three categories of software business described in McKinsey's 1999 book "Secrets of Software Success" remain the classic framing: professional services, popular standard products, and enterprise solutions.

  • Professional services are constrained by marginal cost and are hard to optimize; the segment is highly fragmented and has almost no economies of scale.
  • Popular standard products, represented by various tools and software packages, tend toward concentration in a few leaders under full market competition; disruption comes mainly from new categories or from changes in form such as cloud technology.
  • Enterprise solutions usually combine non-standard software platforms with professional services. After decades of development they are represented by domestic and foreign vendors such as SAP, Salesforce, UFIDA, and Kingdee.

At present, most overseas SaaS companies belong to the second category and strengthen their service through customer success and similar means. Many domestic software companies, especially most SaaS start-ups, operate in the third category: their business centers on pre-sales, solutions, delivery, and operations, and they essentially belong to the software services industry.

2. Why professional services? The essence of software procurement and consumption in China is business code, not products

Why will the first category, professional services, benefit the most in the future, and perhaps even be the endgame?

Two points determine this:

  • The essence of software procurement and consumption in China is business code, not products.
  • When the cost of programming approaches zero, individual digital needs may be met.

The European and American markets have given birth to many SaaS companies, some large and excellent, others small and excellent. Their standardized products, pricing models, and deployment methods are the envy of Chinese software practitioners and investors.

Looking back over the past 10 to 20 years, the main payers for software in China have been the government (G-side) and large enterprises (big B-side), the latter mostly state-owned enterprises plus a small number of leading private companies. This is also why the SaaS industry in China is generally unprofitable.

The main reason behind this is that what China consumes when it buys SaaS is, in essence, code rather than product.

To put it simply, in actual operation, from selection and implementation through acceptance and on into maintenance, the delivery chain of the software business is long and the share of services remains high.

Specifically:

  • Party A's demand iteration: Because every business is different, Party A's needs are diverse and show up as a large number of long-tail, non-standard customization requests. As the business develops, the software supply inevitably lags behind. Even if a PaaS platform can meet 80% of the business needs, the remaining 20% consumes 80% of the delivery cost.
  • Party A's organization: The people who raise digital requirements on Party A's side are many and speak with different voices. Deploying application software usually involves sorting out and re-engineering business processes, so software development comes wrapped in a large amount of consulting, which is an unavoidable feature of software service companies, and users inevitably raise new requirements during delivery. Because both the supply and demand sides lack digital experience and professionalism, many users and customers only know what to re-specify or change once they see the delivered software. This makes rework an extremely prominent problem, and at the end of the long chain it erodes project profit and leaves payment collection figures looking poor.
  • Party A's payment: Most customers who are willing and able to pay are budget-driven, and payment does not strictly follow what the software actually delivered. More awkward still, the demand side often does not play by the rules at acceptance: changing and adding requirements is routine, and unless the vendor goes along, payment cannot be collected on time or at all. As a result, many software companies suffer from long delivery cycles, little recognized revenue, and large receivables. The root causes are Party A's non-professional, cost-driven or commercially driven vendor selection, the maintenance difficulty created by private deployment, Party B's weak technical foundation and over-commitment, and the accountability system that government bodies and state-owned enterprises apply after acceptance or payment.
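The 80%/20% figures in the first bullet above imply a steep difference in unit cost. The following is a minimal arithmetic sketch, assuming the percentages refer to shares of requirements and shares of delivery cost respectively; the function name `unit_cost_ratio` is hypothetical and introduced only for illustration.

```python
def unit_cost_ratio(long_tail_share: float, long_tail_cost_share: float) -> float:
    """Cost of one long-tail requirement relative to one standard requirement.

    Assumption: `long_tail_share` is the fraction of requirements the platform
    cannot cover, and `long_tail_cost_share` is the fraction of total delivery
    cost those requirements consume.
    """
    standard_share = 1.0 - long_tail_share
    standard_cost_share = 1.0 - long_tail_cost_share
    # Average cost per unit of requirement in each group.
    long_tail_unit_cost = long_tail_cost_share / long_tail_share
    standard_unit_cost = standard_cost_share / standard_share
    return long_tail_unit_cost / standard_unit_cost


# 20% of the needs consuming 80% of the cost.
ratio = unit_cost_ratio(0.20, 0.80)
print(f"A long-tail requirement costs about {ratio:.0f}x a standard one")
```

Under that reading, each long-tail requirement costs roughly sixteen times as much to deliver as a standard one, which is why the customized remainder dominates project economics.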

Software consumption in the Chinese market is real and huge, but what is actually being consumed is customized code rather than product; this can be called code consumption.


3. When the cost of programming approaches zero, the three core elements of the software supply side change, and personalized digital needs may be satisfied

At present, the purpose of consuming software or code is the digital translation of business needs. Those needs are inherently non-standard, customized, and long-tail, and as businesses develop and grow, personalized needs keep emerging. Because supply is short, the demand side either compromises on the solution and accepts as much standardization as it can, or, with a limited budget, keeps reinventing the wheel and failing, which makes the supply side's living conditions even worse.

Over the long term, the application of code generation focuses on the software services industry. It aims to separate software from services, so that implementing software becomes automated end-to-end code generation and a standardized tool, while services become more and more specialized.

The endgame of code generation is expected to reshape the software industry into two types: more focused professional services, and software generation platforms that automate delivery.

The main purpose of automating the path from the product requirements document (PRD) to software engineering is to break the three QCD (Quality, Cost, and Delivery) bottlenecks, which are the three core elements of the software supply side, and ultimately to make software consumption universal and equitable.

Code generation and the three elements of software supply

Nearly 100 million programmers in the world and nearly 10 million programmers in China account for trillions of dollars of labor cost in the software industry, yet high-quality programming talent is a relatively small share of that workforce. At the same time, the huge market for software consumption cannot be served promptly and efficiently, which shows a serious imbalance between supply and demand.

  • Efficiency: Apart from the consulting work at the requirements-sorting stage, the development cycle and the post-launch cycles for adding, changing, and removing requirements are all measured in weeks and months, and a software company's marginal cost rises rather than falls. In the endgame of code generation, professional consulting that is tightly bound to business scenarios still exists, but the delivery of software requirements after that can shrink to minutes or hours, bringing a huge efficiency dividend to the software industry.
  • Quality: The PC and mobile Internet eras produced many high-quality infrastructure capabilities around cloud-native technology, big data, and algorithms, but high-quality technical supply remains extremely scarce in the software industry. Code generation can greatly lower the threshold for applying these high-end capabilities, making the technology broadly accessible and transforming the infrastructure of the software industry.
  • Cost: Delivery in many software categories carries an unavoidable share of service cost, and the common charging method is the "man-day" model. Code generation can bring the marginal cost of software engineering delivery close to the cost of the computing power that generation requires, so that information technology becomes infrastructure rather than superstructure.
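The contrast in the cost bullet above, between "man-day" pricing and a marginal cost that approaches the price of compute, can be written out as a small comparison. This is a sketch only: every figure below is an illustrative assumption rather than data from the article, and the function and variable names are hypothetical.

```python
def man_day_cost(man_days: float, day_rate: float) -> float:
    """Delivery cost under the 'man-day' charging model."""
    return man_days * day_rate


def generation_cost(compute_hours: float, price_per_hour: float) -> float:
    """Delivery cost when the marginal cost is only the compute used to generate code."""
    return compute_hours * price_per_hour


# Illustrative assumptions for one change request (currency unit left unspecified).
manual = man_day_cost(man_days=10.0, day_rate=1500.0)
generated = generation_cost(compute_hours=2.0, price_per_hour=4.0)

print(f"man-day model: {manual:.2f}")
print(f"generation model: {generated:.2f}")
```

The point of the comparison is structural rather than numeric: in the first model cost scales with human time, in the second it scales with compute, so the gap widens with every additional change request.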

3. Key capabilities and status quo of intelligent code generation

1. Key capabilities of intelligent code generation

Through deep learning, the code model can understand the semantics and structure of the code, and automatically generate high-quality code according to the input requirements. Specifically, the code model has the following key capabilities:

Semantic understanding

The code model can understand the semantic information contained in the code, including the meaning of functions, variables, control flows, and other levels. This allows the model to generate code that conforms to the business logic based on actual needs.

Structure-aware

The model can not only understand the semantics of the code, but also perceive the structural characteristics of the code, such as object-oriented class structure and modular design. This ensures that the generated code is readable and maintainable.

Contextual modeling

The code model has powerful contextual modeling capabilities, and can generate appropriate code snippets based on the input requirements and combined with the context information of the code. This contextual awareness helps ensure code coherence and applicability.

Diversity generation

The code model can not only generate a single code solution, but also generate a variety of code alternatives, providing developers with more choices. This facilitates the exploration of better coding implementations.
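One way a developer might make use of several generated alternatives is to run each against the same checks and keep only those that pass. The sketch below assumes this workflow for illustration; the three `candidate_*` functions are hand-written stand-ins for model outputs, not real generated code, and `select_passing` is a hypothetical helper.

```python
from typing import Callable, List, Tuple

# Hand-written stand-ins for three alternatives a model might propose
# for the requirement "return the numbers in ascending order".
def candidate_a(xs: List[int]) -> List[int]:
    return sorted(xs)

def candidate_b(xs: List[int]) -> List[int]:
    # Flawed: silently drops duplicate values.
    return list(dict.fromkeys(sorted(xs)))

def candidate_c(xs: List[int]) -> List[int]:
    # Flawed: sorts in descending order.
    return sorted(xs, reverse=True)

# Shared acceptance checks: (input, expected output).
CASES: List[Tuple[List[int], List[int]]] = [
    ([3, 1, 2], [1, 2, 3]),
    ([], []),
    ([2, 2, 1], [1, 2, 2]),
]

def passes_checks(fn: Callable[[List[int]], List[int]]) -> bool:
    """Return True if the candidate produces the expected output for every case."""
    try:
        return all(fn(list(inp)) == expected for inp, expected in CASES)
    except Exception:
        # A candidate that crashes is treated as failing.
        return False

def select_passing(candidates: List[Callable[[List[int]], List[int]]]) -> List[str]:
    """Names of the candidates that satisfy all checks, in the order given."""
    return [fn.__name__ for fn in candidates if passes_checks(fn)]

print(select_passing([candidate_a, candidate_b, candidate_c]))
```

Only `candidate_a` survives here. The checks, rather than the developer's reading of each alternative, do the first round of filtering.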

Object-oriented modeling

The code model can identify concepts such as classes, objects, inheritance, and polymorphism in object-oriented programming, and generate code that conforms to object-oriented design principles based on these structural characteristics. This ensures that the generated code is reusable and extensible.

Modular design

The model can perceive the modular design of the software system and understand the dependencies and interface contracts between different modules. This ensures that the generated code fits well into the existing software architecture and allows for good modular splitting.

Design pattern recognition

The code model is able to identify common software design patterns, such as singleton patterns, factory patterns, observer patterns, and so on. It automatically applies appropriate design patterns to generate high-quality, reusable code structures based on actual needs.
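For readers unfamiliar with the patterns named above, here is a minimal sketch of one of them, the observer pattern, to show the kind of structure being referred to. It is a hand-written illustration, not output from any code model, and the `OrderEvents` class and its methods are hypothetical names.

```python
from typing import Callable, List


class OrderEvents:
    """Subject in the observer pattern: notifies every subscriber of a new order."""

    def __init__(self) -> None:
        self._observers: List[Callable[[str], None]] = []

    def subscribe(self, observer: Callable[[str], None]) -> None:
        """Register a callback to be invoked on every published order."""
        self._observers.append(observer)

    def publish(self, order_id: str) -> None:
        """Notify all subscribers, in subscription order."""
        for observer in self._observers:
            observer(order_id)


log: List[str] = []
events = OrderEvents()
# Two independent observers react to the same event without knowing about each other.
events.subscribe(lambda order_id: log.append(f"invoice:{order_id}"))
events.subscribe(lambda order_id: log.append(f"notify:{order_id}"))
events.publish("A-1")
print(log)
```

The subject does not need to know what its subscribers do, which is what makes the structure reusable when new reactions to an event are added later.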

2. There are two modes of current code generation

At present, there are two main types of product solution in the field of code generation: agents built on existing general-purpose large models, and end-to-end solutions built on dedicated models.

Most participants belong to the former category. Their products are mostly plug-ins built on code completion, almost all mainstream LLM companies offer code products, and many independent agent products are also emerging; representatives include GitHub Copilot, Cursor, August, and Cognition. According to statistics, the accuracy of code generation is between 30% and 40%.

"Autopilot", meaning end-to-end generation of complete software and applications, is the more final technical direction for code generation. This kind of solution tackles the underlying technical bottleneck that keeps general-purpose models from generating code accurately. It builds on the Transformer with self-developed, more advanced model architectures, so that the commercialization of large models can move into scenarios with low fault tolerance. Representative companies are Poolside, Magic, and AIGCode.
