Overseas product operations: go-live testing, based on personal understanding

This article looks at how to run effective go-live testing so that each step contributes to the success of the final product. It covers the core metrics of a test, the fence metrics, and the three main testing methods.

Once you have a system you can use flexibly, the next question is how to run go-live tests. Testing methods can look complicated, but once you understand the core ideas and organize your thinking, testing becomes straightforward.

01 Core Metrics and Fence Metrics

First, decide what to measure. The item under test is the "core metric" of the test. Pay close attention to how many core metrics you have: in principle a test has exactly one, although in practice special cases allow some flexibility.

For example, if one release changes both data backup and disk formatting, the two changes have very little to do with each other. Their data fluctuations can be recorded as two separate core metrics in the same release and analyzed together with later data.

The performance of the core metric usually decides whether the test succeeds. Its fluctuations reflect both how users actually experience the change and the change's final impact on the product.

A change to a product often affects more than the change point itself, so you also need to watch secondary metrics, the "fence metrics" (often called guardrail metrics).

Although it is not encouraged, the product's final revenue measures, such as conversion rate or average user value, can serve as fence metrics in a test. A test can have several fence metrics. Some are closely related to the core metric; others are included simply because they matter to the product.

Fence metrics generally do not reflect the test result directly, but they are needed to record anomalies, to explain fluctuations in the core metric, and to provide supporting data.

02 Test Methods

Once you know what to test, the next question is how to test it. Three test methods are commonly used in practice.

1. Go live directly and compare before and after

This is probably the most common method for small and medium-sized products. It is used in day-to-day work when the product under test has a limited sample size or the test is of low importance.

For these products, a rigorous A/B comparison could take months or even years to collect enough data, which is unrealistic in practice. The results of a before-and-after comparison are not fully reliable, however: the two periods differ, so variables other than the product change are likely to be in play.

For example, a before-and-after comparison that spans Christmas or a major promotion season is generally not meaningful. As the promotion season arrives, users' willingness and ability to convert can change sharply, so data fluctuations in that period are probably not caused, or at least not mainly caused, by the product change.

So if you go live directly and compare before and after, first decide whether the time period is suitable for testing at all, then control the duration and the variables, and collect enough samples to rule out chance results. Second, after the test ends, keep tracking the data for a while to confirm the result.
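One common way to check whether a before-and-after difference could be chance is a two-proportion z-test on the conversion counts. The sketch below is an illustration, not the author's procedure: the function name and the example numbers are hypothetical, and the test says nothing about seasonal effects, which still have to be ruled out by choosing the time period carefully.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_before: int, n_before: int,
                          conv_after: int, n_after: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the change in conversion rate."""
    p_before = conv_before / n_before
    p_after = conv_after / n_after
    # Pooled rate under the assumption that the change had no effect.
    pooled = (conv_before + conv_after) / (n_before + n_after)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))
    z = (p_after - p_before) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: 100 of 1,000 users converted before, 150 of 1,000 after.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value only says the difference is unlikely to be random noise. It does not say the product change caused it, which is why the follow-up tracking still matters.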

2. A/B comparison testing

A/B comparison testing is a more accurate method: versions A and B go live at the same time, users are randomly assigned to the two versions in a 1:1 ratio, and the performance of the two versions is then compared.
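A 1:1 split is often implemented by hashing a user identifier, so that each user lands in the same version on every visit. The sketch below assumes string user IDs and a hypothetical experiment name; it is one possible approach, not a description of any particular platform.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "go_live_test") -> str:
    """Deterministically place a user in version A or B with a 1:1 split."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Even hash values go to A and odd ones to B, so a given user
    # always sees the same version for this experiment.
    return "A" if int(digest, 16) % 2 == 0 else "B"


# Check the split on 10,000 made-up user IDs.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}")] += 1
print(counts)
```

Including the experiment name in the hash keeps assignments in one test independent of assignments in another.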

This method places demands on sample size, because all participating samples are split into two equal halves. Running an A/B test on a small data volume is likely to let chance dominate the results. In general, the larger the sample size, the more accurate the test results.

Before the test, you can calculate the required sample size on a specialist site (such as evanmiller.org) and use it as a reference, so you know what to aim for. It is not a problem if you cannot fully reach the required sample size, because you can still judge the result from the performance of the core metric in combination with the fence metrics.
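For a rough idea of what such a calculator does, the sketch below uses the standard normal-approximation formula for comparing two conversion rates. The function name, the default significance level (0.05) and power (0.8), and the example rates are assumptions for illustration; online calculators may use slightly different formulas and so return somewhat different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline: float, min_detectable_lift: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed in EACH version to detect an absolute lift."""
    p1 = baseline
    p2 = baseline + min_detectable_lift
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical case: 10% baseline conversion, detect a rise to 12%.
print(sample_size_per_variant(0.10, 0.02))
```

Halving the lift you want to detect roughly quadruples the required sample, which is why small products often cannot run a full A/B test.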

3. Grayscale test

When you are worried that a change will have a large impact on users or the market and you cannot predict how they will react, a grayscale release is a good choice. The key to a grayscale release is controlling the pace of the rollout and the share of users exposed. In general, it is not advisable to improvise as you go. Set a schedule in advance, for example 30% in the first week, 50% in the second, 70% in the third, and 100% in the fourth.

This type of test is special. If you can, pay particular attention to collecting user feedback and changes in ratings during the test, either by waiting for users to submit feedback or by actively asking them for opinions and suggestions. Then decide from the data fluctuations and the feedback whether adjustments are needed.
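A fixed schedule like the 30% / 50% / 70% / 100% example can be expressed as a small lookup plus a stable per-user bucket. The sketch below is an assumption about how one might wire this up; the names are hypothetical. Hashing each user into a bucket from 0 to 99 means that a user who gets the new version in week 1 keeps it in later weeks.

```python
import hashlib

# Planned share of users on the new version per week (example from the text).
ROLLOUT_SCHEDULE = {1: 30, 2: 50, 3: 70, 4: 100}


def rollout_percent(week: int) -> int:
    """Percentage of users exposed in a given week of the rollout."""
    if week < 1:
        return 0  # not live yet
    return ROLLOUT_SCHEDULE.get(week, 100)  # fully live after the last step


def user_bucket(user_id: str) -> int:
    """Map a user to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def sees_new_version(user_id: str, week: int) -> bool:
    return user_bucket(user_id) < rollout_percent(week)


for week in range(1, 5):
    exposed = sum(sees_new_version(f"user-{i}", week) for i in range(1000))
    print(f"week {week}: {exposed / 10:.1f}% of sample users exposed")
```

If feedback or data in a given week looks bad, the schedule can be paused by holding the percentage at its current value rather than advancing it.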

03 Test Results

Test results generally fall into one of the following cases:

  1. If both the core metric and the fence metrics are positive and exceed expectations, expectation management needs to be improved.
  2. If both the core metric and the fence metrics are positive and in line with expectations, the project has practical value. Review and document the go-live process, then extend the change to other products, testing each one separately.
  3. If the core metric and the fence metrics disagree and the results look abnormal, analyze the data in detail. If there is a data anomaly, locate it, adjust, and retest; if the cause is something else, analyze and adjust case by case.
  4. If both the core metric and the fence metrics are negative, investigate the specific causes and decide whether to push ahead, adjust and retest, or abandon the test and roll back.
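The four cases above can be written down as a small decision helper, which is handy as a checklist when reviewing a test. This is only a sketch: the function name and return strings are hypothetical, and the case of positive results that fall short of expectations is not covered by the list, so the code treats it like case 2 as an assumption.

```python
def next_step(core_positive: bool, fence_positive: bool,
              exceeds_expectations: bool = False) -> str:
    """Map a test outcome to the follow-up action from the four cases."""
    if core_positive and fence_positive:
        if exceeds_expectations:
            # Case 1: better than forecast.
            return "Revisit expectation management."
        # Case 2 (assumption: also used when results are positive
        # but below expectations, which the list does not cover).
        return "Review the go-live process and extend to other products."
    if core_positive != fence_positive:
        # Case 3: core and fence metrics disagree.
        return "Analyze the data in detail; fix any anomaly, adjust, retest."
    # Case 4: both negative.
    return "Investigate causes, then push ahead, retest, or roll back."


print(next_step(core_positive=True, fence_positive=False))
```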

04 Special Considerations

Product testing has many pitfalls. Here are a few important points that need special attention.

1. Go-live verification

Be sure to confirm that the change is actually live. This is especially important!

2. Data Tracking

Before the test, set up the relevant data collection and event tracking. Only when tracking is fully prepared and the data is actually being captured can you get an accurate test result.

3. Controlling the number of changes

Not every change has to be tested, and a single test does not need to change everything at once. In general, the fewer points you change, the more accurate the test result.

If several interrelated points are changed in a single release, analyze the test data with extra caution and do not draw conclusions lightly.

4. What to do if the results are poor

Before going live, leave yourself a way back: prepare a withdrawal or rollback mechanism in advance. This does not mean cancelling the test at the first hint of trouble, though. Each test should run for its full observation cycle where possible, and the length of that cycle, for example one or two weeks, should be fixed before the test starts.

The main stages of product operation are now essentially covered. What remains is the review, and the next article will explain the review work that follows the launch of a product operation plan.

Author: Wu Tong, public account: Ermeow's stupid minion

This article was originally published by @吴桐 on Everyone is a Product Manager. Reproduction without the permission of the author is prohibited.

The title image is from Unsplash and is licensed under CC0.

The views in this article are the author's own. Everyone is a Product Manager is a platform that only provides information storage services.